Loren Christiansen
- Aug 28, 2023
- 11 min read

The Ultimate Guide to Salesforce Troubleshooting

Avoid rabbit holes and solve errors quickly.

Salesforce can make life so easy for you and the end users in your org. However, a big flashing error can ruin all of that in a second. If you're anything like me, you've spent countless hours on a problem that ended up being a simple fix. Below you'll find a framework to understand and efficiently solve issues that plague your org.

Understanding Types of Errors

You will encounter a bunch of different types of errors while working in a Salesforce Org. It's important to understand the differences between each type because it can give you clues about how to get to the root of the problem. Note that the names I'm using to refer to different error types aren't official Salesforce names, just descriptors to help keep everything straight.

Hard Errors

These types of errors are the most critical because they block a user from progressing. They could make it impossible to save a record, cause a flow to fail, or something else that stops a process from continuing. Sometimes these are on purpose, and other times there's an issue with the underlying automation.

Thankfully, these errors are typically easier to solve because they're very obvious. End users will report them to you or you'll get emails about them if you're on the list for exception emails.

The errors that are thrown on purpose are easy to recognize. Here are a few:

Duplicate rules with the action set to 'Block.' The error message will be accompanied by a "Duplicates Detected" link, which is a dead giveaway that you need to check the duplicate rules. If the record isn't truly a duplicate, find the duplicate rule and tweak the criteria. Alternatively, you can set the action to 'Allow' and check 'Report' instead of using 'Block,' which will track the duplicate on a report while still allowing the user to create/edit the record.
Fields set as required on the Page Layout or in the field settings. As a best practice, fields should only be marked as required in field settings (Object Manager -> Object -> Fields & Relationships -> Field Name -> Required checkbox) if the field is absolutely essential for every single record. Being trigger happy with this required checkbox can cause a lot of headaches during data imports or integrations because of the large number of fields to keep track of. Making a field required on the page layout is more manageable long term as your org grows in complexity.
Validation rules. The error message for these will say the word 'Validation.' Check the validation rules for the object and review the criteria to ensure it's correct.
Login Issues. A user, for obvious reasons, will be blocked from continuing if they forget their password or don't have access to the device they use for MFA. You can solve these on the User's Detail page.
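One thing that trips people up when reviewing validation rule criteria: the rule describes the error condition, so the save is blocked when the formula evaluates to true. Here's a minimal Python sketch of that idea. It is not how Salesforce implements anything, and the rule name, field names, and message are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class ValidationRule:
    name: str
    error_condition: Callable[[Record], bool]  # True means "block the save"
    message: str


def check_record(record: Record, rules: List[ValidationRule]) -> List[str]:
    """Return the message of every rule whose error condition fires."""
    return [
        f"{rule.name}: {rule.message}"
        for rule in rules
        if rule.error_condition(record)
    ]


# Hypothetical rule: a won opportunity must have a close date.
rules = [
    ValidationRule(
        name="Close_Date_Required_When_Won",
        error_condition=lambda r: r.get("Stage") == "Closed Won"
        and not r.get("CloseDate"),
        message="Enter a close date before marking the opportunity as won.",
    ),
]

blocked = check_record({"Stage": "Closed Won", "CloseDate": None}, rules)
allowed = check_record({"Stage": "Prospecting", "CloseDate": None}, rules)
```

If a rule is blocking records it shouldn't, read the criteria as "when is this true?" rather than "what am I trying to allow?" and the mistake is usually easier to spot.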

Hard errors that are thrown because of broken automation are much more difficult to solve because the metadata involved is different in every org. Strategies to solve these will be discussed further below.

Soft Errors

Automation may not throw any error messages, but the process could still be broken. These types of errors cause bad/confusing data. A simple example of this could be a currency formula field used for reporting that has incorrect logic. The field is working just fine, but all the reports will be incorrect. A more complex example is a screen flow that confirms to the user it's making changes, but nothing is actually happening.

These types of errors can be found through user reports or (preferably) careful testing before the release of any new metadata.
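To make the formula example concrete, here's a small Python sketch of a calculation that never throws an error but is still wrong. It assumes a hypothetical number field that stores 15 to mean 15%; the function names are invented for illustration.

```python
def net_amount_buggy(amount: float, discount: float) -> float:
    # Bug: treats 15 (meaning 15%) as if it were the fraction 0.15.
    # Nothing fails here; the result is just silently wrong.
    return amount * (1 - discount)


def net_amount_fixed(amount: float, discount: float) -> float:
    # Convert the whole-number percentage to a fraction first.
    return amount * (1 - discount / 100)


buggy_result = net_amount_buggy(1000, 15)
fixed_result = net_amount_fixed(1000, 15)

# A simple pre-release check: a discounted amount should never be
# negative or larger than the original amount.
buggy_is_plausible = 0 <= buggy_result <= 1000
fixed_is_plausible = 0 <= fixed_result <= 1000
```

A sanity check like the one at the bottom is the kind of test that catches a soft error before users do.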

Permission/Security Errors

These are related to soft errors, but they deserve to be called out specifically because the troubleshooting is typically different. Overly restrictive sharing settings can hide important records or objects from users. Profile permissions can also cause problems when onboarding new users if they don't have access to the correct Apps, Objects, Fields, etc.

Integration Errors

Integrations are really useful when they work, but can bring an org to a grinding halt if they break. The errors can be related to any of the other error types, and are highly dependent on how the integration was set up.

Performance/System Errors

Certain bulk processes can exceed org limits or cause issues with the database. For example, record locking errors can occur when updating a large number of child records at once without following data import best practices.
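One commonly recommended import practice is to sort child records by their parent ID before loading, so that rows sharing a parent land in the same batch instead of competing for the parent's lock across batches. The Python sketch below illustrates the idea with made-up Contact rows. It assumes the locking comes from multiple batches touching the same parent; the helper names are hypothetical.

```python
from typing import Dict, List, Set

Row = Dict[str, str]


def make_batches(rows: List[Row], batch_size: int) -> List[List[Row]]:
    """Split rows into consecutive batches, the way a loader would."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def sort_by_parent(rows: List[Row], parent_field: str) -> List[Row]:
    """Order rows so that children of the same parent sit together."""
    return sorted(rows, key=lambda row: row[parent_field])


def parents_split_across_batches(batches: List[List[Row]], parent_field: str) -> int:
    """Count parents whose children appear in more than one batch."""
    seen_in: Dict[str, Set[int]] = {}
    for index, batch in enumerate(batches):
        for row in batch:
            seen_in.setdefault(row[parent_field], set()).add(index)
    return sum(1 for batch_ids in seen_in.values() if len(batch_ids) > 1)


# Six made-up contacts that alternate between two accounts.
contacts = [{"Id": f"C{n}", "AccountId": "A1" if n % 2 else "A2"} for n in range(6)]

unsorted_split = parents_split_across_batches(make_batches(contacts, 3), "AccountId")
sorted_split = parents_split_across_batches(
    make_batches(sort_by_parent(contacts, "AccountId"), 3), "AccountId"
)
```

In the unsorted file both accounts show up in both batches; after sorting, each account is confined to a single batch.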

Apex CPU timeout errors occur when a single transaction exceeds the platform's CPU time limit, usually because of inefficient design in Flows, Process Builders, Apex, or other automation tools.

General Approach to Troubleshooting

There are some rules for troubleshooting that apply no matter what system you're working in. If you're looking for Salesforce-specific tactics, skip to the next section. In general, approach software problems by identifying a repeatable problem, determining root cause, establishing scope, and fixing the problem with an update to the appropriate metadata.

Identify a Repeatable Problem

Reports of an error can come from a variety of different places - user reports, automated emails, custom error handling, error messages, etc. The first task should always be to clearly define the problem and establish when it occurs. Avoid using shop talk with end users (e.g., going into the details of a complicated flow or using unfamiliar terms like Apex Triggers, Validation Rules, Lightning Components, etc.), but always seek to understand what's going on under the hood.

If the problem is intermittent, more work needs to be done to determine the exact conditions when an error occurs. Despite how it appears, the system isn't ambiguous. Software can be buggy and error-prone, but it's extremely unlikely that it's truly random. Seek to be able to consistently replicate the conditions in a test environment (like a sandbox) that cause the error before troubleshooting. A good test to see if you've found the right conditions is to think about how you would explain the problem to a coworker. The explanation should sound like, "When I do X under Y conditions, I get Z error." If it's not that straightforward, do some more digging or ask for help.

Note that problems do occur with the Salesforce Platform, and you may need to log a case with technical support, but you should be very sure that the problem couldn't be anything else before doing so.

If you don't deeply understand the conditions that cause an error to occur, chances are higher that you will waste time going down technical rabbit holes that are unrelated to the actual problem.

Determine Root Cause

When you've discovered how to replicate the problem in isolation, start hunting for a root cause. Pattern recognition can be helpful here. If you've identified that the error occurs to a certain set of data or only with a certain set of users, those are clues about where to look. Check any automation that touches the data, check user permissions, do research on the specific error messages or codes, etc. Salesforce-specific methods will be discussed in the next section.

This step is important because it identifies what needs to be fixed. It's much easier to fix something when you know what's causing the problem. This is obvious in the physical world when you need to fix a car, phone, or other device, but it can get lost in software. Always look for specific metadata that is causing the problem, and document the reasons why.

Keep an open mind, and take note of anything that you can't explain. Anyone that's worked with buggy software will tell you that the cause of the problem is sometimes completely unrelated to what you originally thought!

As a real life example, I was once informed about an Activity Dashboard that was supposed to measure how many calls were made each week, but it was counting down as the week went on. The reporting user would write down their call number at the end of the day, and then the next day the number was lower. I initially thought that the report filters weren't tuned correctly, and that maybe the report was measuring a rolling 7 days instead of each calendar week. Nope! Everything on the report looked perfect. After a couple more days of watching the report and checking all activity records created by the user, I eventually figured out that a third-party tool the org was connected to had a nightly automation with some incorrect logic that was updating certain call records to have a subtype of 'Email' instead. Since the report was only looking at calls, records appeared to disappear overnight! This was a perfect example of the soft error type discussed above. While the process was 'working' and there were no errors, the logic was clearly wrong, and it made users distrust Salesforce as a source of truth.

Establish Scope

Now that you've determined what's causing the problem, it's also important to identify how many records or processes have been impacted. You'll most likely have stakeholders invested in what you're troubleshooting, so communicate with them once you understand the problem and its scope.

Aim to identify how many records are currently impacted, and how many will be impacted in the future if no action is taken. These are both important because they can help you or your team prioritize your approach to fixing the problem. For example, if the problem only happens every once in a while, but there is a high number of currently impacted records, it may make more sense to dedicate more resources to fixing the data so that users stop seeing the problem. Other times, the problem is happening so frequently (or completely preventing a process from completing) that it's better to stop the bleeding and fix the underlying issue before repairing data.

Establishing scope should happen after the problem is identified because it's much easier to make educated guesses about what data could be impacted when you know the source of the issue.
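As a rough illustration, here's how the two scope numbers could be pulled from an exported list of records. This Python sketch loosely mirrors the call report story above. The rows, the field names (`Type`, `Subtype`, `ModifiedOn`), and the `summarize_scope` helper are all made up for the example.

```python
from collections import Counter
from typing import Callable, Dict, List

Row = Dict[str, str]


def summarize_scope(
    rows: List[Row], is_impacted: Callable[[Row], bool], date_field: str
) -> Dict[str, float]:
    """Count impacted rows and estimate how many new ones appear per day."""
    impacted = [row for row in rows if is_impacted(row)]
    per_day = Counter(row[date_field] for row in impacted)
    days = len(per_day)
    daily_rate = len(impacted) / days if days else 0.0
    return {
        "impacted_now": len(impacted),
        "days_with_impact": days,
        "avg_new_per_day": daily_rate,
    }


# Made-up export: calls that were wrongly given an 'Email' subtype.
activities = [
    {"Id": "T1", "Type": "Call", "Subtype": "Email", "ModifiedOn": "2023-08-21"},
    {"Id": "T2", "Type": "Call", "Subtype": "Email", "ModifiedOn": "2023-08-21"},
    {"Id": "T3", "Type": "Call", "Subtype": "Call", "ModifiedOn": "2023-08-21"},
    {"Id": "T4", "Type": "Call", "Subtype": "Email", "ModifiedOn": "2023-08-22"},
    {"Id": "T5", "Type": "Email", "Subtype": "Email", "ModifiedOn": "2023-08-22"},
]

scope = summarize_scope(
    activities,
    is_impacted=lambda r: r["Type"] == "Call" and r["Subtype"] == "Email",
    date_field="ModifiedOn",
)
```

The first number tells you how much cleanup is waiting; the daily rate tells you how fast the pile grows if you leave the broken process running.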

Fix the Problem

If all of the above has been completed, this step should hopefully be easy. You obviously need to actually make the changes that will stop the errors from occurring in the future. Work carefully with stakeholders and team members to create a solution that doesn't introduce even more errors. If possible, use version control systems and backups so that it's possible to revert if there are more problems after the fix.

Typically there will be two separate tasks when resolving an error:

Changing metadata so the problem stops (Permissions, code, config, etc.)
Updating data to fix any problems that the issues caused (Specific field updates, creating missing records, deleting invalid records, etc.)

To put it simply, you need to fix the broken process and fix anything the process messed up along the way. If you know the root cause and scope, this should be a breeze. Again, Salesforce-specific tactics will be discussed below.
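For the data half of the fix, it helps to capture the old values before changing anything so the update can be reverted. Below is a minimal Python sketch of that habit, assuming the records have been exported as rows. The `plan_data_fix` helper and the field names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


def plan_data_fix(
    rows: List[Row],
    needs_fix: Callable[[Row], bool],
    field: str,
    corrected_value: str,
) -> Tuple[List[Row], List[Row]]:
    """Build a backup of current values and the matching list of updates."""
    backup: List[Row] = []
    updates: List[Row] = []
    for row in rows:
        if needs_fix(row):
            backup.append({"Id": row["Id"], field: row[field]})  # for reverting
            updates.append({"Id": row["Id"], field: corrected_value})
    return backup, updates


# Made-up rows: two calls carry the wrong subtype, one is already correct.
tasks = [
    {"Id": "T1", "Type": "Call", "Subtype": "Email"},
    {"Id": "T2", "Type": "Call", "Subtype": "Call"},
    {"Id": "T3", "Type": "Call", "Subtype": "Email"},
]

backup, updates = plan_data_fix(
    tasks,
    needs_fix=lambda r: r["Type"] == "Call" and r["Subtype"] == "Email",
    field="Subtype",
    corrected_value="Call",
)
```

Save the backup somewhere safe before loading the updates. If the fix turns out to be wrong, loading the backup file puts the original values back.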

Salesforce Troubleshooting Techniques

Let's return to Salesforce to get some practical examples of everything discussed above. Thankfully, the developers at Salesforce have created a bunch of tools that make troubleshooting easier.

Sandboxes

This is probably the most useful tool in your troubleshooting toolbelt. Especially for large, complex problems, the data in Production is too important to just start making changes to try and identify what's causing the problem. Creating a Sandbox (Setup -> Sandboxes) of the environment that is having the errors gives you an exact replica of everything that could be causing the problem, and you're free to test without compromising data integrity. If needed, create test data in your sandbox that mimics the data that initially caused the problem, and start trying to recreate the error.

In order for Sandboxes to be effective in troubleshooting, it's important to keep them refreshed. If you have full or partial sandboxes where new development is being worked on, consider keeping a Developer Sandbox that's only used for troubleshooting that can be refreshed as soon as a problem is identified. This can work with Full/Partial sandboxes, but it's riskier because the refresh limits (# of days in between refreshes) are longer, and you may not be able to refresh them as frequently as errors occur.

Note that Sandboxes are excellent for determining root cause, but establishing scope is better done in Production (or whatever environment has the problem; you may be troubleshooting an error in a full sandbox full of new development). Production has your actual data, and running queries/reports to check for data quality problems won't break anything or impact user experience. Find the problem in the Sandbox, but check for impacted records in Production.

Login as a User

This is another absolutely essential tool. Always log in as the impacted user so that you can see what they see. It's easy to go through a flow and declare, "It works!" but miss that the user you're building it for doesn't have write access to a key field, so the whole thing breaks when you deploy it in production.

Log in as the user to understand the problem and to test the solution. A sandbox works great for this. Deploy the solution in the sandbox, log in as the user, and then try to break the solution. Test the original cause of the problem, but also attempt to break other parts of the process. Enter bad field data, try to skip steps or go backwards when you shouldn't, etc. All of these errors will come up eventually in a live environment, so test them all before you release a solution.

Learn how to log in as a user by clicking here.

Logs & Debugging

Logs are hugely important to software troubleshooting. Almost any action you take in Salesforce (even viewing a record!) generates a log that records what happened in what order. Learning to read logs is a skill that could have an entire post dedicated to it, so if you've never used them, it's a good idea to do some research. Click here for a good starting point.

To view a log, open the Developer Console, perform the action you're troubleshooting, and then double click on the generated log that appears in the 'Logs' tab.

Filter for error statements, or find what automation is being triggered in what order by going through each line of the log. Note that you can set the 'log level,' or amount of information recorded, to make reading through the log easier. Again, there's a lot to this, so use the link above to look at the options.
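If you've downloaded a log as a text file, filtering for error statements can also be scripted. The Python sketch below assumes the usual pipe-delimited log lines with the event type in the second position. The sample lines are written by hand for illustration (not captured from a real org), and `find_error_lines` is a hypothetical helper.

```python
from typing import List, Set, Tuple

# Hand-written sample in the general shape of a debug log.
SAMPLE_LOG = """\
16:06:58.18 (18043585)|CODE_UNIT_STARTED|[EXTERNAL]|AccountTrigger on Account trigger event BeforeUpdate
16:06:58.20 (20112233)|USER_DEBUG|[12]|DEBUG|Checking account rating
16:06:58.25 (25998877)|EXCEPTION_THROWN|[15]|System.NullPointerException: Attempt to de-reference a null object
16:06:58.26 (26100000)|FATAL_ERROR|System.NullPointerException: Attempt to de-reference a null object
"""

# Event types to treat as errors; extend this set as needed.
ERROR_EVENTS: Set[str] = {"EXCEPTION_THROWN", "FATAL_ERROR"}


def find_error_lines(
    log_text: str, error_events: Set[str] = ERROR_EVENTS
) -> List[Tuple[int, str, str]]:
    """Return (line number, event type, detail) for each error line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        parts = line.split("|")
        if len(parts) >= 2 and parts[1] in error_events:
            hits.append((number, parts[1], parts[-1]))
    return hits


hits = find_error_lines(SAMPLE_LOG)
```

Once you know which line the first error sits on, read upward from there to see which automation was running right before it.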

If you're using Salesforce Flow, the 'Debug' button is an excellent way to accomplish the same thing. Instead of picking through a long log file, the debug button executes your flow in real time and gives you the steps in a more readable format. If you aren't using it, start today! You can run flows on specific test records, read flow variable changes for each flow element, and more. Thank me later!

If you're working with Apex, the Salesforce extensions for VS Code have excellent real-time debugging. Check them out here.

Trace Flags

If you need to look at logs over longer periods of time, use trace flags to save logs for a certain time period for specific users, Apex classes, or Apex triggers. You pick the specific entity to monitor, the start and end dates for monitoring, and the debug level (similar to the 'log level' mentioned in the section above).

Trace flags are invaluable for a problem that's intermittent or only occurs at a certain time of the day. They give you a group of logs containing the error that you can use to identify root cause. Combine this with a sandbox where you can freely test time-based automation, and you should be much closer to finding a solution.
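Once a trace flag has collected a batch of logs, a quick way to spot a time-of-day pattern is to bucket the failing logs by hour. Here's a minimal Python sketch of that. The timestamps are made up and `errors_by_hour` is a hypothetical helper; it assumes you've already noted the start time of each log that contained the error.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def errors_by_hour(timestamps: Iterable[datetime]) -> Counter:
    """Count how many failing logs started in each hour of the day."""
    return Counter(ts.hour for ts in timestamps)


# Made-up start times of logs that contained the error.
failing_logs = [
    datetime(2023, 8, 21, 2, 5),
    datetime(2023, 8, 22, 2, 7),
    datetime(2023, 8, 23, 2, 4),
    datetime(2023, 8, 23, 14, 30),
]

hour_counts = errors_by_hour(failing_logs)
busiest_hour, busiest_count = hour_counts.most_common(1)[0]
```

A cluster around one hour is a strong hint to look at scheduled jobs, nightly integrations, or other time-based automation that runs at that time.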

Documentation

A solved problem can quickly resurface if the solution isn't documented. Whether you're creating or updating metadata, ensure that you populate the description fields with useful information. People will have to work with what you build, and what seems simple today could be confusing to someone 6 months from now. If you haven't already, you'll soon have the experience of slogging through a problem and thinking, "Who could have designed such a poor system? Oh...it was me."

If your organization uses documentation outside of Salesforce or a ticketing system to record issue resolution, be sure to carefully record the steps discussed above. Include the root cause, the scope of impacted data, the proposed solution, and how well the execution of the solution went.

Change Management

Your underlying goal for troubleshooting should be to increase user confidence in Salesforce and the data it contains. If users don't trust Salesforce, adoption rates will be low and data quality will suffer. Fixing errors can raise user confidence, but changing a process without clear communication can also make people uneasy. If you can fix the problem without changing anything about the process or UI, great! Get approval to make the change, and push it. However, if anything is going to be changing for the end user, make sure to notify them directly in advance.

This isn't nearly as difficult as solving the problem itself, but in many cases it can be just as important. A good system is useless if no one uses it. Communicate early and often about what changes are going to occur. If necessary, provide training, demos, and release dates so that everyone is on the same page about what's going to happen.

Conclusion

Troubleshooting is more of an art than a science. A combination of curiosity, platform knowledge, collaboration, and persistence will solve most problems quickly. If any of these tips have been helpful, I'd love to hear about it! Drop me a note on the 'Contact' tab with any stories you'd like to share. Additionally, if I missed anything, let me know!

-Loren