
Best Practices for List Processing

List processing is one of the many options Service Objects offers for validating your data. This option is ideal for validating large sets of existing data when you’d rather not set up an API call or would simply prefer us to process the data quickly and securely. There is good reason to have us process your list: we hold high standards for security and will treat your file with the utmost care.

As part of our list processing service, we offer PGP encryption for files and SFTP file transfers to keep your data private and secure. We also have internal applications that allow us to process large lists of data quickly and easily. We have processed lists ranging from tens of thousands of records to upwards of 15 million records. Simply put, we consider ourselves experts at processing lists, and we’ll help ensure that your data gets the best possible results from our services.

That said, a few steps on your end can help ensure that your data is processed efficiently. For the best list processing experience, and the best data available, we recommend following these best practices.

CSV preparation

Our system processes CSV files, and we will convert any other file format to CSV prior to list processing. If you would prefer to deliver a CSV file to us directly, keep the following CSV preparation best practices in mind:

Processing international data – If you have a list of international data that needs to be processed, make sure the file uses the right character encoding. For example, if the original data is in an Excel spreadsheet, converting it to CSV can mangle any foreign characters in your file. This may not be an issue when processing a list of US addresses, but if you are processing an international set of addresses through our DOTS Address Validation International service, lost characters can significantly affect your results. One workaround is to save the file as Unicode text through Excel and then set the encoding to UTF-8 with BOM in a text editor. Another option is to send us the Excel file with the foreign characters preserved, and we will convert it to CSV with the proper encoding.
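If you would rather handle the conversion yourself, here is a minimal sketch of one way to do it in Python with the pandas library (this assumes pandas and openpyxl are installed; the file names are hypothetical placeholders):

import pandas as pd

# Read the original spreadsheet; pandas keeps the Unicode characters intact.
df = pd.read_excel("international_addresses.xlsx")

# "utf-8-sig" writes UTF-8 with a byte-order mark (BOM), which preserves
# accented and non-Latin characters for downstream CSV processing.
df.to_csv("international_addresses.csv", index=False, encoding="utf-8-sig")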

Preventing commas from creating unwanted columns – Enclosing any field that contains commas in quotation marks prevents stray commas from shifting data into the wrong columns of your CSV file. This ensures that the right data is processed when our applications parse the CSV file.
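Most CSV libraries handle this quoting for you. As a small illustration, here is a sketch using Python's built-in csv module (the sample rows and file name are hypothetical):

import csv

rows = [
    ["BusinessName", "Address1", "City", "State", "Zip"],
    ["Acme, Inc.", "123 N Main St W", "Santa Barbara", "CA", "93101"],
]

with open("prepared_list.csv", "w", newline="", encoding="utf-8") as f:
    # QUOTE_MINIMAL quotes only the fields that need it (e.g. "Acme, Inc."),
    # so embedded commas cannot shift data into the wrong columns.
    writer = csv.writer(f, quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)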

Use multiple files for large lists

When processing a list with several million records, breaking it into multiple files of about 1 million records each helps our system process the list more easily and also allows for a faster review of the results.
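If you want to split the file yourself before sending it, here is a minimal sketch using Python's csv module (the file names and chunk size are hypothetical, and the header row is repeated in each part):

import csv

CHUNK_SIZE = 1_000_000  # roughly one million records per output file

def write_part(part, header, rows):
    # Write one chunk to its own file, repeating the header row.
    with open(f"full_list_part{part}.csv", "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(rows)

with open("full_list.csv", newline="", encoding="utf-8") as source:
    reader = csv.reader(source)
    header = next(reader)
    part, rows = 1, []
    for row in reader:
        rows.append(row)
        if len(rows) == CHUNK_SIZE:
            write_part(part, header, rows)
            part, rows = part + 1, []
    if rows:  # write any remaining records
        write_part(part, header, rows)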

Including a unique ID for each of the records in your list helps when updating your business application with the validated data.
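One simple way to add such an ID, sketched below with Python's csv module, is to prepend a sequential RecordID column before sending the file out (the column name and file names are hypothetical):

import csv

with open("full_list.csv", newline="", encoding="utf-8") as source, \
     open("full_list_with_ids.csv", "w", newline="", encoding="utf-8") as out:
    reader = csv.reader(source)
    writer = csv.writer(out)

    writer.writerow(["RecordID"] + next(reader))  # header row
    for record_id, row in enumerate(reader, start=1):
        writer.writerow([record_id] + row)  # sequential unique ID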

Configure the inputs for the service of choice

Matching your input data to the fields our service expects can speed up list processing time. For example, some lists parse address line 1 into separate fields (e.g., 123 N Main St W would have separate columns for 123, N, Main, St, and W), while DOTS Address Validation 3 currently has inputs for BusinessName, Address1, Address2, City, State and Zip. While we can certainly manipulate the data as needed, preformatting the data for our validation service can improve both list processing time and the turnaround time for updating your system with freshly validated data.
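As an illustration, here is a sketch of recombining pre-parsed address columns into the single Address1 field that DOTS Address Validation 3 expects. The input column names (Number, PreDir, StreetName, Suffix, PostDir) and file names are hypothetical; adjust them to match your own export:

import csv

with open("parsed_addresses.csv", newline="", encoding="utf-8") as source, \
     open("av3_ready.csv", "w", newline="", encoding="utf-8") as out:
    reader = csv.DictReader(source)
    fields = ["BusinessName", "Address1", "Address2", "City", "State", "Zip"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()

    for row in reader:
        # Join the parsed pieces, skipping empty ones,
        # e.g. 123 + N + Main + St + W -> "123 N Main St W".
        address1 = " ".join(
            part for part in (row.get("Number"), row.get("PreDir"),
                              row.get("StreetName"), row.get("Suffix"),
                              row.get("PostDir")) if part
        )
        writer.writerow({
            "BusinessName": row.get("BusinessName", ""),
            "Address1": address1,
            "Address2": row.get("Address2", ""),
            "City": row.get("City", ""),
            "State": row.get("State", ""),
            "Zip": row.get("Zip", ""),
        })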

These best practices will help ensure a fast and smooth list processing experience. If you have a file you need cleansed, validated or enhanced, feel free to upload it here.