What startups need to know about data science!

If you are building, or thinking of building, a startup product and are worried that it has no data science built in, then this article is for you. I am referring to products or services that do not have data science, artificial intelligence, or machine learning as their core function, which is most of you.

Being in the field of data science and having worked with local startups, I often get asked how to enable data science [or artificial intelligence (AI) or machine learning, which are near synonyms for most people] for startup products, or for companies more generally. Unfortunately, the buzz has taken hold, and in all too many cases people nearly stop production because they get caught up in the confusing mess of technology and statistics required for any genuinely useful application of data science, let alone its computational partner, AI. I don’t say this to discourage but rather to help focus efforts where they matter most. In this post, we will examine the role that data science should play as you build your startup solution. To that end, we will discuss why your product doesn’t need data science yet, what to look for in your initial market testing to spot future data science opportunities, and how to begin laying the groundwork for future data science integration.

Before going much further, I want to pause briefly and define what I mean by “data science.” Although the term is often obscured by its associated technologies, when I refer to data science I am referring to the process of capturing data, transforming it so that it can be analyzed, using statistical models to find patterns in that data, and using those models to answer questions (e.g., to make decisions). With this process we can answer questions that are challenging for humans because the data are complex, such as “Who is more at risk for a heart attack?” Or we can help machines answer questions that are easy for humans but hard for machines, such as “Is this a person or an animal?” But I digress; back to our discussion of data science and startups.

Why your startup doesn’t need data science…yet…

First, let’s consider how most startups work. Most startup companies get started because the inventors/creators have identified a human problem that they can solve with their own unique, and often combined, experiences. What is very important to keep in mind here is that your solution, the one you developed from experience rather than data science (okay, maybe a little data science, or at least research, for the more rigorous of us), is solving a human problem without data science (*mind blown*). It seems obvious, but it is fundamental. Second, when you keep this core focus in mind, you see that data science, just like everyone else trying to sell your new startup something it doesn’t need, is a distraction that can dampen your startup motivation. Moreover, this illusion is actively perpetuated by the giants that own solutions in the fields you are looking to penetrate. They tout the coolness of their data science capabilities, which ultimately leaves you feeling as though you simply can’t compete without data science.

All hyperbole aside, the core message is important to repeat: your product was created sans data science and so should be brought to market sans data science. But that doesn’t mean that you can’t prepare for the future…

What to look for in your initial market release…

Once your product is ready for an alpha release, it becomes important to consider the future opportunities that data science may bring to your product. But how do you prepare for data science when you are still struggling to figure out what it means? Remember that data science can help us, us being people (owners, users, customers, etc.) or machines (apps, robots, phones, etc.), to answer questions. What this means is that you need to be sensitive to the questions that both you and your core customer base have as they experience your product.

Case in point: a startup develops an application that allows people to keep track of their college friends in one centralized location. Fast forward 14 years and Facebook is now a tech giant making strong and notable contributions to basic data science, but it by no means started there. What Zuckerberg did recognize was that his users had questions, and he sought ways in which the data he collected could help his users answer those questions (“Has anyone posted a photo of me?”, “If I have to see advertisements, what ads are most relevant to me?”, “Can’t Facebook just automatically tag my friends?”, etc.).

The takeaway for this section is to listen to your users as you roll out your product. Focus groups, surveys, emails, or any other opportunity to receive feedback is an opportunity to add context to the continued evolution of your product or service. Examine the questions and challenges your users have and consider whether your solution can collect the information needed to answer them. If you identify some information that your product naturally collects from customers that may answer their question, then bingo…you have a data science use case. Thus, data science should be use-case driven, such that each data science solution is attached to clear business value.

Okay, I have got my use cases, what now?

Although getting into specifics surrounding how to establish a data science pipeline like the one I describe at the beginning is beyond the scope of this article, I will leave you with a few ideas to consider along with some resources for digging deeper. The key to answering any question using data science starts with data (like it is literally the beginning of the phrase…duh). This means that you need to identify opportunities and some simple technologies to capture data.

Possible capture mechanisms include:

  • Relational Databases – SQLite, MySQL, PostgreSQL
  • Non-Relational Databases – MongoDB
  • File Systems – Basic Windows File system
  • Here is a useful description of some of the top open source DB solutions

Relational databases can be great if you know exactly what you want to capture, but non-relational databases provide more flexibility for collecting information that has less structure. Finally, file systems (like the one on your PC where you save Word docs and family photos) can also be used, but because they will capture anything and are not easy to extract information from, they may not be the best option.

No matter which solution you choose, try to find one that allows you to automatically collect the information from your product or service. This ensures greater consistency in the data and reduces the risk of building biased insights in future analytics. In other words, promising me that you will remember to enter all those survey responses from customers and save them in a file somewhere probably isn’t a good data capture strategy.

Once you have a good, or even decent, mechanism for capturing and saving data, the remaining steps can get a bit complex and may require a more traditional data science consultant to build the insights you are interested in leveraging. It is important to note that I am grossly oversimplifying the data science process here, but by the time you get to this point, hopefully you have generated enough revenue and identified enough high-value use cases to justify hiring some additional help. For those of you who are interested in more technical details around setting up a more robust data science pipeline, I highly recommend this blog series that teaches how to leverage cloud resources to execute an end-to-end data science pipeline.
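To make the idea of automatic capture a little more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. Everything in it is an assumption made for illustration: the events table, its columns, and the function names open_event_store, record_event, and count_events_by_type are hypothetical and not taken from any particular product. The point is only that your product code calls one small function whenever something interesting happens, so nobody has to remember to type anything in later.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_event_store(path=":memory:"):
    """Open (or create) a SQLite database with one hypothetical events table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS events (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id TEXT NOT NULL,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            created_at TEXT NOT NULL
        )
        """
    )
    return conn


def record_event(conn, user_id, event_type, payload=None):
    """Store one event; call this from product code so capture is automatic."""
    conn.execute(
        "INSERT INTO events (user_id, event_type, payload, created_at) "
        "VALUES (?, ?, ?, ?)",
        (
            user_id,
            event_type,
            json.dumps(payload or {}),  # free-form details stored as JSON text
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    conn.commit()


def count_events_by_type(conn):
    """Return {event_type: count}, a first step toward answering a question."""
    rows = conn.execute(
        "SELECT event_type, COUNT(*) FROM events GROUP BY event_type"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = open_event_store()  # in-memory here; pass a file path to persist
    record_event(conn, "user-1", "photo_uploaded", {"tagged_friends": 2})
    record_event(conn, "user-2", "photo_uploaded")
    record_event(conn, "user-1", "search", {"query": "college friends"})
    print(count_events_by_type(conn))
```

Note that the user_id column is information about a person, so even a toy table like this one raises the security and privacy questions that any real data capture will have to address.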

Recognize the hurdles you must overcome before executing on data science in your products:

  • Hurdle 1: Do not overestimate or underestimate the business value of data science; overestimating is the bigger risk in your early stages
  • Hurdle 2: Be careful not to jump in without a defined plan and process
  • Hurdle 3: Keep in mind that collecting data means keeping information on people, so security and privacy will be important issues to address
  • Hurdle 4: When a high-value use case is identified, clearly define success metrics
  • Hurdle 5: Building data science requires some level of experience with data engineering, statistics, and scripting. Thus, it is essential to find a trusted partner to help enable your budding data science practice.

Thanks for reading and please feel free to reach out to let us know what you liked, didn’t like, or would like to see more of. We are particularly interested in any future content you would like us to examine so don’t be shy. Email (info@betacosine.com) or comment below.
