{"id":1251,"date":"2017-01-26T18:00:50","date_gmt":"2017-01-26T18:00:50","guid":{"rendered":"https:\/\/gosqeng.test\/?p=1251"},"modified":"2019-11-28T11:19:59","modified_gmt":"2019-11-28T11:19:59","slug":"csv-import","status":"publish","type":"post","link":"https:\/\/www.gosquared.com\/blog\/csv-import","title":{"rendered":"How manually importing CSV files informed our new import feature"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/static.gosquared.com\/images\/liquidicity\/17_01_26_csvimport_01.png\" alt=\"GoSquared CSV import\"\/><\/p>\n<p>When we first set out to build our CSV importer, we decided only to allow customers to upload their user CSVs from our front end to an AWS S3 bucket.<\/p>\n<p>However, rather than automatically importing the data, we decided to manually undertake the importing process ourselves.<\/p>\n<h2>Ship early. Ship often.<\/h2>\n<p>Why did we release a clearly-unfinished feature?<\/p>\n<p>Firstly, we wanted to make sure that the data customers imported into GoSquared was accurate and, secondly, it gave us an opportunity to understand better the variables of the data we would be receiving. What were the file sizes like? What type of character encoding should we expect? What types of delimiters were being used (comma, semicolon, or pipe?) How long would the imports take? And what factors could make an import fail?<\/p>\n<p>Once we were able to identify the common patterns, we were able to build our back-end logic to handle the data and the edge cases that our research validated.<\/p>\n<h2>Deciding what decisions our customers would need to make<\/h2>\n<p>In the front-end we wanted to give our users the opportunity to customise the CSV before they imported it.<\/p>\n<p>Most CSVs will likely be an export of data from another application such as the user\u2019s previous sales CRM. Its naming convention may not be what our user want to have appear in their GoSquared account (e.g. \u201c_first_name\u201d rather than \u201cFirst Name\u201d) &mdash; or some of the data may not be relevant at all.<\/p>\n<h2>Defining the server-side architecture<\/h2>\n<p>For the architecture, we decided to build a lightweight Ruby on Rails service using its API configuration.<\/p>\n<p>This would receive a POST request internally from our front end servers with the ID of the user importing, the project they\u2019re importing to, the URL of the CSV file, and an array of headers that our they want to keep or ignore. From parsing the headers in the front-end, we could also validate the delimiter type by establishing the most common character and pass that on as a parameter.<\/p>\n<h3>Using ActiveJob<\/h3>\n<p>Additionally we noticed that the majority of the CSVs we had received had a lot of user data to import. We knew the files could take a while to import, and it wasn\u2019t critical for the import to be processed right away. Once our API received the request it would use the Rails&#8217;s built in ActiveJob framework to declare the job, which we configured to use with Sidekiq as the queuing back-end. <\/p>\n<h3>Sidekiq for background processing<\/h3>\n<p>Sidekiq is a gem that allows easy implementation of background processing for Ruby applications. It works by spawning multiple worker threads from a single master worker process and is a great way to offload long running tasks to the background.<\/p>\n<h3>Heroku to the rescue!<\/h3>\n<p>This overall setup allowed us to easily host our service on Heroku. As some of you will know, <a href=\"https:\/\/www.gosquared.com\/blog\/aws-summit-london-2015\" target=\"_blank\" rel=\"noopener noreferrer\">we use Amazon Web Services for almost everything<\/a> at GoSquared. But this was a project where we wanted to move quickly, and experiment with a few different technologies, and Heroku seemed the perfect fit.<\/p>\n<p>As Sidekiq interfaces with a Redis instance, after adding a RedisToGo addon to manage the Redis instances, we were able to get our service deployed in no time, all at a minimal cost.<\/p>\n<h2>How our CSV parsing logic was informed by manually importing user data<\/h2>\n<p>We used a lot of our manual import learnings to build the logic for parsing the CSV and mapping the data into a suitable structure that we could post to our Tracking API.<\/p>\n<p>One of the consequences of having clients based all around the world is that you tend to get words with diacritic characters (e.g. Pierr\u00e9) within the import. In a lot of cases that will mean that the CSV file is ISO8859-1 encoded and this needs to be converted UTF-8, otherwise it could raise an ArgumentError upon reading.<\/p>\n<p>You are able to handle this in Ruby by forcing the UTF-8 encoding when you read the file:<\/p>\n<pre class=\"language-ruby\">\n<code class=\"language-ruby\">@csv = CSV.read(path_to_csv, encoding: \"ISO8859-1:utf-8\")<\/code>\n<\/pre>\n<p>Also by doing the validation in the front end, we could just pass in the separator we identified from our params.<\/p>\n<p>Once we had the CSV in a readable format, we could then start to split out our data.<\/p>\n<p>We needed each row to be in key value pairs to be able to post to our tracking end point:<\/p>\n<pre class=\"language-ruby\">\n<code class=\"language-ruby\">{\"id\"=>\"19263\", \"email\"=>\"me@russellvaughan.com\", \"name\"=>\"Russell Vaughan\"}<\/code><\/pre>\n<p>We found it really useful to split out our headers as a separate object for a few reasons. Firstly, as the headers object would always be the same length as the rows below it, we could easily map the key value pairs based on the index.<\/p>\n<p>Also, by having the headers as an object, we could then use the headers object sent in as params in its place, featuring the changed header names. Any columns we wanted to skip, we could do so by just using a \u2018false\u2019 value in its position in the headers when we sent to our back-end service.<\/p>\n<pre class=\"language-ruby\">\n<code class=\"language-ruby\">@csv.each_with_index do |row, index|\n   properties = {}\n   row.length.times do |_number|\n    @headers.each_with_index do |h, i|\n      next if h == false\n      properties[headers[i].strip] = row[i].strip unless row[i].nil?\n    end\n  end\nend<\/code>\n<\/pre>\n<p>We were than able to take that object of key-value pairs and post to the <a href=\"https:\/\/www.gosquared.com\/docs\/api\/tracking\" target=\"_blank\" rel=\"noopener noreferrer\">GoSquared Tracking API<\/a>.<\/p>\n<h2>Using ActiveJob\u2019s callbacks<\/h2>\n<p>One of the nice things about ActiveJob is its use of callbacks. This meant that after the job completed we could make use of the \u201cafter perform\u201d method to post-track an event called \u2018csv import completed\u2019 against that user in our own <a href=\"https:\/\/www.gosquared.com\/crm\/\" target=\"_blank\" rel=\"noopener noreferrer\">GoSquared People<\/a> when the job had completed.<\/p>\n<h2>Dogfooding (or: drinking our own champagne)<\/h2>\n<p>Within our own GoSquared People, we were able to build a Smart Group, filtered by all users that had that event track against them; then, once that was saved, we could use our <a href=\"https:\/\/www.gosquared.com\/customer\/en\/portal\/articles\/2063698-drip\" target=\"_blank\" rel=\"noopener noreferrer\">Drip integration<\/a> to automate an email to these users, letting them know the job had been completed.<\/p>\n<p>Although it can be tempting to jump right into building a solution that will automate your process, by investing time in truly understanding the process of manually importing CSVs, we were able validate our assumptions and ultimately build a better feature, that would deliver more value to our users.<\/p>\n<h2>Want to try out the new import feature?<\/h2>\n<p>GoSquared is free to use for 14 days \u2013 try importing your users and see what you can learn about them with <a href=\"https:\/\/www.gosquared.com\/join\/crm\/\" target=\"_blank\" rel=\"noopener noreferrer\">our 14 day free trial<\/a>.<\/p>\n<h2>Further reading<\/h2>\n<p><a href=\"https:\/\/medium.com\/@GoSquared\/shipping-early-is-easy-shipping-often-is-hard-2a8c66b094e1#.y02i5m6lq\" target=\"_blank\" rel=\"noopener noreferrer\">Shipping early and shipping often<\/a><\/p>\n<p><a href=\"https:\/\/www.gosquared.com\/blog\/what-does-a-sales-engineer-do\">What does a sales engineer do?<\/a><\/p>\n<p><a href=\"https:\/\/www.gosquared.com\/blog\/t2-instances\">Running T2 instances on EC2<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When we first set out to build our CSV importer, we decided only to allow customers to upload their user&#8230;<\/p>\n","protected":false},"author":25,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1452],"tags":[1212,1216],"class_list":["post-1251","post","type-post","status-publish","format-standard","hentry","category-engineering","tag-ruby","tag-ruby-on-rails"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v18.6 (Yoast SEO v19.0) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How manually importing CSV files informed our new import feature - GoSquared Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.gosquared.com\/blog\/csv-import\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How manually importing CSV files informed our new import feature\" \/>\n<meta property=\"og:description\" content=\"When we first set out to build our CSV importer, we decided only to allow customers to upload their user...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.gosquared.com\/blog\/csv-import\" \/>\n<meta property=\"og:site_name\" content=\"GoSquared Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/GoSquared\" \/>\n<meta property=\"article:published_time\" content=\"2017-01-26T18:00:50+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-11-28T11:19:59+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/static.gosquared.com\/images\/liquidicity\/17_01_26_csvimport_twitter_01@2x.png\" \/>\n<meta name=\"twitter:creator\" content=\"@Russellvaughan\" \/>\n<meta name=\"twitter:site\" content=\"@GoSquared\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Russell Vaughan\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/#organization\",\"name\":\"GoSquared\",\"url\":\"https:\/\/www.gosquared.com\/blog\/\",\"sameAs\":[\"https:\/\/instagram.com\/gosquaredteam\",\"https:\/\/www.linkedin.com\/company\/go-squared-ltd.\",\"https:\/\/www.facebook.com\/GoSquared\",\"https:\/\/twitter.com\/GoSquared\"],\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.gosquared.com\/blog\/wp-content\/uploads\/2015\/07\/gosquared.png\",\"contentUrl\":\"https:\/\/www.gosquared.com\/blog\/wp-content\/uploads\/2015\/07\/gosquared.png\",\"width\":1270,\"height\":250,\"caption\":\"GoSquared\"},\"image\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/#website\",\"url\":\"https:\/\/www.gosquared.com\/blog\/\",\"name\":\"GoSquared Blog\",\"description\":\"Turn visitors into customers.\",\"publisher\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.gosquared.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#primaryimage\",\"url\":\"\",\"contentUrl\":\"\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#webpage\",\"url\":\"https:\/\/www.gosquared.com\/blog\/csv-import\",\"name\":\"How manually importing CSV files informed our new import feature - GoSquared Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#primaryimage\"},\"datePublished\":\"2017-01-26T18:00:50+00:00\",\"dateModified\":\"2019-11-28T11:19:59+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.gosquared.com\/blog\/csv-import\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.gosquared.com\/blog\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How manually importing CSV files informed our new import feature\"}]},{\"@type\":\"Article\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#webpage\"},\"author\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/#\/schema\/person\/c5fbe35e2c32f13a6e05136b474f43b4\"},\"headline\":\"How manually importing CSV files informed our new import feature\",\"datePublished\":\"2017-01-26T18:00:50+00:00\",\"dateModified\":\"2019-11-28T11:19:59+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#webpage\"},\"wordCount\":1063,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.gosquared.com\/blog\/csv-import#primaryimage\"},\"thumbnailUrl\":\"\",\"keywords\":[\"Ruby\",\"Ruby on Rails\"],\"articleSection\":[\"Engineering\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.gosquared.com\/blog\/csv-import#respond\"]}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/#\/schema\/person\/c5fbe35e2c32f13a6e05136b474f43b4\",\"name\":\"Russell Vaughan\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.gosquared.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/822f03d63c83bea45c7827659fab4514c085e2647cc4d6145ead087e3cec4a56?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/822f03d63c83bea45c7827659fab4514c085e2647cc4d6145ead087e3cec4a56?s=96&d=mm&r=g\",\"caption\":\"Russell Vaughan\"},\"description\":\"Russell is a sales engineer at GoSquared, focused on helping new customers get up and running with the GoSquared platform.\",\"sameAs\":[\"https:\/\/gosquared.com\",\"https:\/\/twitter.com\/Russellvaughan\"],\"url\":\"https:\/\/www.gosquared.com\/blog\/author\/russell\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"How manually importing CSV files informed our new import feature - GoSquared Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.gosquared.com\/blog\/csv-import","og_locale":"en_US","og_type":"article","og_title":"How manually importing CSV files informed our new import feature","og_description":"When we first set out to build our CSV importer, we decided only to allow customers to upload their user...","og_url":"https:\/\/www.gosquared.com\/blog\/csv-import","og_site_name":"GoSquared Blog","article_publisher":"https:\/\/www.facebook.com\/GoSquared","article_published_time":"2017-01-26T18:00:50+00:00","article_modified_time":"2019-11-28T11:19:59+00:00","twitter_card":"summary_large_image","twitter_image":"https:\/\/static.gosquared.com\/images\/liquidicity\/17_01_26_csvimport_twitter_01@2x.png","twitter_creator":"@Russellvaughan","twitter_site":"@GoSquared","twitter_misc":{"Written by":"Russell Vaughan","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Organization","@id":"https:\/\/www.gosquared.com\/blog\/#organization","name":"GoSquared","url":"https:\/\/www.gosquared.com\/blog\/","sameAs":["https:\/\/instagram.com\/gosquaredteam","https:\/\/www.linkedin.com\/company\/go-squared-ltd.","https:\/\/www.facebook.com\/GoSquared","https:\/\/twitter.com\/GoSquared"],"logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.gosquared.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.gosquared.com\/blog\/wp-content\/uploads\/2015\/07\/gosquared.png","contentUrl":"https:\/\/www.gosquared.com\/blog\/wp-content\/uploads\/2015\/07\/gosquared.png","width":1270,"height":250,"caption":"GoSquared"},"image":{"@id":"https:\/\/www.gosquared.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"WebSite","@id":"https:\/\/www.gosquared.com\/blog\/#website","url":"https:\/\/www.gosquared.com\/blog\/","name":"GoSquared Blog","description":"Turn visitors into customers.","publisher":{"@id":"https:\/\/www.gosquared.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.gosquared.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.gosquared.com\/blog\/csv-import#primaryimage","url":"","contentUrl":""},{"@type":"WebPage","@id":"https:\/\/www.gosquared.com\/blog\/csv-import#webpage","url":"https:\/\/www.gosquared.com\/blog\/csv-import","name":"How manually importing CSV files informed our new import feature - GoSquared Blog","isPartOf":{"@id":"https:\/\/www.gosquared.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.gosquared.com\/blog\/csv-import#primaryimage"},"datePublished":"2017-01-26T18:00:50+00:00","dateModified":"2019-11-28T11:19:59+00:00","breadcrumb":{"@id":"https:\/\/www.gosquared.com\/blog\/csv-import#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.gosquared.com\/blog\/csv-import"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.gosquared.com\/blog\/csv-import#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.gosquared.com\/blog"},{"@type":"ListItem","position":2,"name":"How manually importing CSV files informed our new import feature"}]},{"@type":"Article","@id":"https:\/\/www.gosquared.com\/blog\/csv-import#article","isPartOf":{"@id":"https:\/\/www.gosquared.com\/blog\/csv-import#webpage"},"author":{"@id":"https:\/\/www.gosquared.com\/blog\/#\/schema\/person\/c5fbe35e2c32f13a6e05136b474f43b4"},"headline":"How manually importing CSV files informed our new import feature","datePublished":"2017-01-26T18:00:50+00:00","dateModified":"2019-11-28T11:19:59+00:00","mainEntityOfPage":{"@id":"https:\/\/www.gosquared.com\/blog\/csv-import#webpage"},"wordCount":1063,"commentCount":0,"publisher":{"@id":"https:\/\/www.gosquared.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.gosquared.com\/blog\/csv-import#primaryimage"},"thumbnailUrl":"","keywords":["Ruby","Ruby on Rails"],"articleSection":["Engineering"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.gosquared.com\/blog\/csv-import#respond"]}]},{"@type":"Person","@id":"https:\/\/www.gosquared.com\/blog\/#\/schema\/person\/c5fbe35e2c32f13a6e05136b474f43b4","name":"Russell Vaughan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.gosquared.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/822f03d63c83bea45c7827659fab4514c085e2647cc4d6145ead087e3cec4a56?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/822f03d63c83bea45c7827659fab4514c085e2647cc4d6145ead087e3cec4a56?s=96&d=mm&r=g","caption":"Russell Vaughan"},"description":"Russell is a sales engineer at GoSquared, focused on helping new customers get up and running with the GoSquared platform.","sameAs":["https:\/\/gosquared.com","https:\/\/twitter.com\/Russellvaughan"],"url":"https:\/\/www.gosquared.com\/blog\/author\/russell"}]}},"wps_subtitle":"Building a CSV import feature the lean way","_links":{"self":[{"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/posts\/1251","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/users\/25"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/comments?post=1251"}],"version-history":[{"count":0,"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/posts\/1251\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/media?parent=1251"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/categories?post=1251"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gosquared.com\/blog\/wp-json\/wp\/v2\/tags?post=1251"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}