We don’t need “citizen data scientists”; we need “citizens of data science.” I’ve written several blogs and conducted numerous student and executive training sessions on getting business stakeholders to “think like a data scientist.” We are not trying to turn the business stakeholders into data scientists. Instead, we want to train the business stakeholders in “thinking like a data scientist” – to become citizens of data science – by understanding where and how data science can impact their business, in order to accelerate organizational adoption. The “Thinking like a Data Scientist” process has evolved over time, so I’m going to use this blog as an opportunity to document the refined (and hopefully simplified) process. The business stakeholder objectives for the “Thinking like a Data Scientist” methodology are:
The “Thinking like a Data Scientist” methodology has evolved as we’ve applied it across client engagements and learned what works and what doesn’t. So I will use this blog to update the methodology and supporting materials (see flow below). I’ll also use this blog to pull together all the “Thinking like a Data Scientist” blogs into a single location. Besides, pulling all of these blogs into a single blog makes it easier for me when assigning reading to my University of San Francisco business students.

Some Classroom Prerequisites

Before we dive into the methodology, let’s start by defining data science:

Data science is about identifying variables and metrics that might be better predictors of performance.

It is the word “might” that is key to the “Thinking like a Data Scientist” process. The word “might” is a license to be wrong, a license to think outside the box when trying to identify variables and metrics that might be better predictors of performance. In order to create the “right” analytic models, the data science team will embrace a highly iterative, “fail fast / learn faster” environment. The team will test different variables, data transformations, data enrichments and analytic algorithms until they have failed enough times to feel “comfortable” with the variables that have been selected. See the blog “Demystifying Data Science” to better understand the role of “might” in the data science process.

Step 1: Identify Target Business Initiative

If you want your data science effort to be relevant and meaningful to the business, start with a key business initiative. A key business initiative is characterized as:
Examples of key business initiatives could include:
These key business initiatives can be found in annual reports, analyst briefings, executive conference presentations, and press releases, or you can simply ask your executives what the organization’s most important business initiatives are over the next 12 to 18 months (see Figure 1).

Figure 1: Chipotle’s Annual Report and Their Key Business Initiatives

Step 2: Identify Business Stakeholders

Step 2 identifies the business stakeholders (and constituents): those functions that either impact or are impacted by the targeted business initiative. These stakeholders and constituents are the targets for your “Thinking like a Data Scientist” training, as they have the domain knowledge necessary to improve analytic model effectiveness and drive organizational adoption (see Figure 2).

Figure 2: Identify Business Stakeholders or Constituents

Ideally, for each stakeholder or constituent, you would create a single-slide persona that outlines that stakeholder’s or constituent’s roles, responsibilities, decisions and pain points (see Figure 3).

Figure 3: Business Stakeholder Persona

Step 3: Identify Business Entities

Step 3 identifies the business entities (sometimes called “strategic nouns”) around which we will create and capture analytic insights. Business entities include customers, patients, students, physicians, store managers, engineers, and agents. But business entities can also include “things” such as jet engines, wind turbines, trucks, cars, medical devices and even buildings (see Figure 4).

Figure 4: Identify Key Business Entities (or Strategic Nouns)
Ideally, the data science team will create an analytic profile for each individual business entity to help in the capture, refinement and re-use of the organization’s analytic insights. Analytic Profiles capture the organization’s analytic assets in a way that facilitates the refinement and sharing of those analytic assets across multiple use cases (see Figure 5).

Figure 5: Analytic Profiles
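One way to picture an Analytic Profile is as a simple per-entity record that accumulates insights over time. The sketch below is only an illustration under stated assumptions, not a prescribed implementation: the class name `AnalyticProfile`, its field types, the `record_score` helper, and all sample values are hypothetical. The field groups mirror the components this post names for a profile (metrics, predictive indicators, segments, scores, and business rules).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnalyticProfile:
    """Hypothetical per-entity record of reusable analytic insights."""
    entity_type: str   # e.g. "customer", "store", "wind turbine"
    entity_id: str
    metrics: Dict[str, float] = field(default_factory=dict)
    predictive_indicators: Dict[str, float] = field(default_factory=dict)
    segments: List[str] = field(default_factory=list)
    scores: Dict[str, float] = field(default_factory=dict)
    business_rules: List[str] = field(default_factory=list)

    def record_score(self, name: str, value: float) -> None:
        """Add or refine a score so that other use cases can reuse it."""
        self.scores[name] = value


# Illustrative values only; none of these numbers come from the post.
profile = AnalyticProfile(entity_type="store", entity_id="store-001")
profile.metrics["avg_weekly_visits"] = 1250.0
profile.segments.append("urban")
profile.record_score("local_vitality", 72.5)
```

Keeping one record per entity is what lets a score built for one use case be looked up and refined by the next use case, rather than being rebuilt from scratch.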
An Analytic Profile consists of metrics, predictive indicators, segments, scores, and business rules that codify the behaviors, preferences, propensities, inclinations, tendencies, interests, associations and affiliations for the organization’s key business entities. See the blog “Analytic Profiles: Key to Data Monetization” for more details on the workings of an Analytic Profile.

Step 4: Brainstorm Data Sources

Step 4 is focused on leveraging the domain expertise of the business stakeholders to identify those variables and metrics (data sources) that might be better predictors of performance. To facilitate the brainstorming of data sources, we take the business stakeholders through an exercise to convert some of their descriptive questions into predictive questions that support the targeted business initiative. That is, we transition the stakeholders from asking descriptive questions about what happened to asking predictive questions about what is likely to happen. Figure 6 shows an example of the “descriptive to predictive” question conversion.

Figure 6: Converting Descriptive Questions to Predictive Questions
We then take a couple of the most important predictive questions and add the following phrase to that predictive question in order to facilitate the data source brainstorming process: “…and what data sources might we need to make that prediction?” For example:
Then ask the stakeholders to work in small groups to identify and capture the potential data sources on Post-it notes (one variable or data source per note). We then bring all the stakeholders back together to create an aggregated list of potential variables and metrics (data sources) that the data science team might want to test (see Figure 7).

Figure 7: Brainstorming Data Sources (Variables and Metrics)
After brainstorming the data sources, the business stakeholders rank the data sources for each use case based upon that data source’s likely predictive value to that use case (we use a range of 1 to 5). While this process is highly subjective, it’s surprising how accurate the business stakeholders will be in judging which data sources are likely to be the most relevant (see the final result in Figure 8).

Figure 8: Ranking Data Sources vis-à-vis Use Cases

Step 5: Capture and Prioritize Analytic Use Cases

Step 5 brainstorms the decisions necessary to support the targeted business initiative, groups the decisions into similar clusters (use cases), and then prioritizes the use cases based upon business value and implementation feasibility over the next 12 to 18 months. The decisions are gathered via a series of interviews and facilitated brainstorming sessions with the business stakeholders and constituents (see Figure 9).

Figure 9: Brainstorm Decisions by Stakeholder or Key Constituent

NOTE: During the facilitated brainstorming sessions, it is critical to remember facilitation rule #1: All ideas are worthy of consideration!

Next, via a facilitated group exercise with the key business stakeholders and constituents, the decisions are grouped together into similar subject areas (see Figure 10).

Figure 10: Group Decisions into Common Subject Areas

NOTE: During this facilitated grouping exercise, there will be much discussion to clarify the decisions and the grouping of those decisions into similar use cases. Capture these conversations, as the insights from them might be instrumental in the data science execution process.

Finally, you want to prioritize the use cases (on the axes of business value and implementation feasibility) to create an Analytic Use Case Roadmap (see Figure 11).
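The prioritization step can be sketched in a few lines of code. This is a minimal sketch under stated assumptions, not the facilitation process itself: the `UseCase` class, the `prioritize` function, the 1-to-5 scale for both axes, the midpoint of 3.0, the quadrant labels, and the sample use cases are all hypothetical. In practice the positions come out of stakeholder discussion, not a formula.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    business_value: float  # assumed stakeholder consensus, 1 (low) to 5 (high)
    feasibility: float     # assumed implementation feasibility, 1 to 5


def prioritize(use_cases, midpoint=3.0):
    """Order use cases by combined value and feasibility; tag each quadrant."""
    ranked = sorted(
        use_cases,
        key=lambda u: (u.business_value + u.feasibility, u.business_value),
        reverse=True,
    )
    roadmap = []
    for u in ranked:
        high_value = u.business_value >= midpoint
        high_feasibility = u.feasibility >= midpoint
        if high_value and high_feasibility:
            quadrant = "start here"
        elif high_value:
            quadrant = "high value, harder to implement"
        elif high_feasibility:
            quadrant = "easy, lower value"
        else:
            quadrant = "defer"
        roadmap.append((u.name, quadrant))
    return roadmap


# Hypothetical use cases and ratings, for illustration only.
use_cases = [
    UseCase("Improve promotional effectiveness", 4.0, 2.0),
    UseCase("Increase store traffic", 4.5, 4.0),
    UseCase("Redesign loyalty program", 2.0, 1.5),
    UseCase("Optimize staffing", 2.5, 4.5),
]
roadmap = prioritize(use_cases)
for name, quadrant in roadmap:
    print(f"{name}: {quadrant}")
```

The ordering is only a starting point for the conversation; the reasons stakeholders give for moving a use case up or down are usually more valuable than the ranking itself.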
Figure 11: Prioritize Analytics Use Cases

NOTE: During the prioritization process, there will again be much discussion about why certain use cases are positioned vis-à-vis other use cases from both a business value and implementation feasibility perspective. Capture these conversations, as they might yield critical insights that impact the ultimate funding of the data science project.

Step 6: Identify Potential Analytic Scores

Step 6 focuses on grouping variables and metrics into similar clusters that the data science team can then explore as the basis for creating analytic “scores” or recommendations. Scores are analytic models composed of a variety of weighted variables that can be used to support key operational decisions. Maybe the most familiar score is the FICO score, which combines a variety of weighted metrics about a loan applicant’s financial and credit history in order to create a single value, or score, that lenders use to determine a borrower’s likelihood to repay a loan. For our example, we can start to see two groupings of variables around two potential scores: “Local Economic Potential” and “Local Vitality” (see Figure 12).

Figure 12: Grouping Variables into Potential Scores

Scores are critical components in the “Thinking like a Data Scientist” process. They are the collaboration point between the business stakeholders and the data science team in developing analytics to support the decisions and the key business initiative. Scores support the key operational decisions that the business stakeholders are making in support of our targeted business initiative.

Step 7: Identify Recommendations

Step 7 ties everything together: the scores that support the recommendations to the key operational decisions that support our business initiative.
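As a rough illustration of how a score can feed a recommendation, the sketch below combines a few weighted variables into a single 0-to-100 value and maps it to a suggested action. Everything specific here is an assumption made for the example: the variable names, the weights, the 60-point threshold, and the recommendation text are hypothetical, and a data science team would normally learn the weights from data rather than set them by hand.

```python
def weighted_score(values, weights):
    """Combine normalized variables (each in [0, 1]) into a 0-100 score."""
    missing = set(weights) - set(values)
    if missing:
        raise ValueError(f"missing variables: {sorted(missing)}")
    total_weight = sum(weights.values())
    raw = sum(weights[name] * values[name] for name in weights)
    return 100.0 * raw / total_weight


def recommend(score, threshold=60.0):
    """Map a score to a hypothetical recommendation for one decision."""
    if score >= threshold:
        return "increase local marketing spend"
    return "hold local marketing spend"


# Hypothetical weights for a "Local Economic Potential" style score.
weights = {
    "income_growth": 0.5,
    "employment_rate": 0.3,
    "new_housing_starts": 0.2,
}

# Hypothetical, already-normalized inputs for a single store.
store = {
    "income_growth": 0.8,
    "employment_rate": 0.9,
    "new_housing_starts": 0.5,
}

score = weighted_score(store, weights)
print(round(score, 1), recommend(score))
```

The point of the sketch is the chain, not the numbers: variables roll up into a score, the score drives a recommendation, and the recommendation supports a specific operational decision.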
The worksheet in Step 7 is best created in collaboration with business stakeholders (who understand the decisions and can envision the potential recommendations) and the data science team (who understand how to convert the scores into analytic models). See Figure 13.

Figure 13: Linking Decisions to Recommendations to Analytic Scores

Thinking like a Data Scientist Summary

I expect that this process will continue to evolve as we execute more data science projects and collaborate with the business stakeholders to ensure that the data and the data science work are delivering quantifiable and measurable business value. As they famously say: Watch this space!

Additional Sources:

“Thinking like a Data Scientist Part I: Understanding Where to Start”
“Thinking like a Data Scientist Part II: Predicting Business Performance”
“Thinking like a Data Scientist Part III: The Role of Scores”
“Thinking like a Data Scientist Part IV: Attitudinal Approach”
“The ‘Thinking’ Part of ‘Thinking like a Data Scientist’”
“Data Science: Identifying Variables that Might Be Better Predictors”

The post Refined Thinking like a Data Scientist Series appeared first on InFocus Blog | Dell EMC Services.
