Data Product Topic & Terminology Cheat Sheet
Summary of Product Topics
Software digitizes activities. LinkedIn digitizes resumes and puts them online.
Platforms provide access to a collection of tools and features that facilitate related activities. They also let users interact with multiple software solutions and with each other. LinkedIn's endorsements, InMail, and Connections are all features of its platform, which includes the tools to put your resume online. YouTube's advertising, comments, and ratings are features of its online publishing platform.
Over time, the collective user engagement on the platform creates a value moat that prevents competitors from offering the same value. You can find another resume digitizer, but you can't bring all your connections and messages with you. Done well, the user interactions themselves cost the platform provider almost nothing to operate but are of high value to the platform's users.
Summary of Data & Tech Topics
A company's Data Center is an internal platform that allows the company to create external-facing software and, eventually, a platform. The Data Center stores the source code that contains the business logic that digitizes processes. Data Infrastructure is a set of shared technology components that deliver the data that software processes need in order to run. Examples are data warehouses and data automation tools. Designed and used well by a data-literate organization, data infrastructure prevents data silos from forming.
Vendors sometimes create data silos deliberately to build another common type of moat, a data moat. It is usually a tactic adopted by software vendors, not platforms. Vendors who create these intentional data silos often refuse to let customers extract their own data into a new software solution, or charge discouragingly high fees for programmatic access to it. These moats are becoming less defensible because of legislation and improved customer data capabilities. More data-literate customers, often companies, take on the responsibility of liberating their own data from vendors and bringing it into their data centers to use for the benefit of their business and customers. This work is called ETL (extract, transform, load), and sequences of related ETL tasks are called data pipelines.
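To make the ETL idea concrete, here is a minimal sketch of a three-step pipeline in Python. Everything specific in it is an assumption made for illustration: the vendor export format, the column names, and the customers table are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical vendor export. In practice this would come from a vendor
# API or a file download; the columns here are invented for illustration.
VENDOR_EXPORT = """customer_id,email,signup_date
1, Ada@Example.com ,2021-03-04
2,grace@example.com,2021-05-17
"""


def extract(raw_csv):
    """Extract: read rows out of the vendor's export format."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows):
    """Transform: normalize types and formatting to internal conventions."""
    return [
        (int(row["customer_id"]), row["email"].strip().lower(), row["signup_date"])
        for row in rows
    ]


def load(records, connection):
    """Load: write the cleaned records into the company's own store."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER PRIMARY KEY, email TEXT, signup_date TEXT)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", records
    )
    connection.commit()


def run_pipeline(raw_csv, connection):
    """A data pipeline: the three ETL tasks run in sequence."""
    load(transform(extract(raw_csv)), connection)


# An in-memory SQLite database stands in for the warehouse.
warehouse = sqlite3.connect(":memory:")
run_pipeline(VENDOR_EXPORT, warehouse)
emails = [
    row[0]
    for row in warehouse.execute(
        "SELECT email FROM customers ORDER BY customer_id"
    )
]
print(emails)
```

Keeping extract, transform, and load as separate functions means each step can be tested or swapped on its own, for example when the company changes vendors and only the extract step needs to change.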
Better Architecture improves the design of the platform as a growing number of software solutions and ETL pipelines must coordinate with each other to add value. For example, LinkedIn's InMail needs to be able to look up the user's actual email address, collected from the online resume, in order to send email notifications. Together, the data architecture and infrastructure create a place where developers can merge and remix all the data collected by the company to make solutions smarter, more customized, and more interactive. Over time, many platforms invite their users to create their own custom solutions by exposing certain data feeds and APIs, further adding value. This creates an app store, a sign of extremely mature platforms like Salesforce, Facebook, Google, and Apple.
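The InMail example can be sketched as one feature reading data that another piece of software collected. This is only an illustration of the idea, not how LinkedIn implements it: the PROFILES store, the user IDs, and the function names are all hypothetical.

```python
# Hypothetical shared profile store, populated by the resume software.
PROFILES = {
    "u1": {"name": "Ada", "email": "ada@example.com"},
    "u2": {"name": "Grace", "email": "grace@example.com"},
}


def lookup_email(user_id):
    """Shared lookup that any feature on the platform can call."""
    return PROFILES[user_id]["email"]


def build_notification(sender_id, recipient_id, message):
    """Messaging feature: reuses profile data instead of collecting its own."""
    sender_name = PROFILES[sender_id]["name"]
    return {
        "to": lookup_email(recipient_id),
        "subject": f"New message from {sender_name}",
        "body": message,
    }


notification = build_notification("u1", "u2", "Are you open to a quick chat?")
print(notification["to"], "|", notification["subject"])
```

Because the messaging feature depends on a shared lookup and not on its own copy of the email addresses, a change to a user's profile is picked up by every feature at once, and no new silo forms.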
Platforms, Software, and User Engagement to Create Value Moats
Companies create platforms. Platforms enable access to a set of tools and features that allow users to come together to perform a set of interrelated activities: create solutions, interact with each other, and share. Platforms can be internal or external.
A very common internal platform is a data center. A data center gives developers a place to create software. The platform also gives customers access to engage with that software. Other platform tools include single sign-on, subscription management, and revenue collection. These platform features make it easier for developers to release software and benefit the company without having to implement the same logic over and over again in silos.
Another common internal platform is a company wiki. The company wiki has a set of tools that allow people to share their knowledge and interact with each other's documents.
External platforms are things like LinkedIn, Facebook, and YouTube, where users come to do a variety of activities related to professional networking, friend networking, and content.
Both internal and external platforms enable multiple people to come together to create, distribute, and access content and solutions.
It's easy to confuse platforms and software. Software uses technology to digitize activities. Viewed as software, LinkedIn digitizes resumes. But viewed as a platform, LinkedIn now includes the following less obvious software and features: endorsements, InMail, Sales Navigator, data feeds and APIs that let other developers create LinkedIn apps, LinkedIn Articles, and advertisements.
All of these tools and features are only available because 1) LinkedIn created a core piece of software that collects resumes and 2) LinkedIn expanded beyond defining itself as a single software solution to become an online platform that facilitates all professional networking activities. Those activities are fueled by the resume data and by the network effects that resume sharing and professional networking create. Everyone was pulled into LinkedIn either to share their own resume or to access someone else's. But once you are there, your profile's connections to other people keep you on LinkedIn. It's very easy to find another software solution to digitize your resume. It's very hard to switch away from LinkedIn, because your profile on that platform now includes all your interactions with other users.
Three kinds of R&D investment speed up the rate at which new features can be developed, released, and integrated across the platform: services that deliver on-demand packets of information via the web (APIs), internal data pipelines (data engineering), and continuous integration and deployment of new features and experiments (DevOps). Internally, this enables faster and cheaper R&D. Externally, this enables a seamless user experience across all tools within a platform. Architecture and data engineering best practices can help make sure that today's design and development work in the data center and on APIs improves tomorrow's R&D and feature set.
Platform features all fall into one or more of these categories: core application, user engagement, revenue generation, or value adding for the user with very little R&D investment by the platform owner. All of these should create value for the community on the platform.