Below I have broken down a simple six-part outline of what a technology platform organization is and how to run it. The platform organization is the bottom of the technology stack. The top of the stack differs from the platform in that it has external customers and more focus on product design and the user interface. However, this outline can still largely apply. This plan is built on 20 years of working in and around technology organizations. You will find that my Open Source First ideas are the rallying cry for how to run a better platform organization. All the steps that follow have the Open Source First behavior mixed in. As in my Open Source First post, transparency is the key thread throughout this outline.
1. What Are You?
A platform organization can be defined as all the underlying functions required to build and / or support customer facing software applications. That includes the data center, server, network, load balancer, firewall, storage, datastore, service bus, service registry, data search, data analysis, data visualization, application server services, monitoring, and alerting (did I miss anything?) in all their various forms and sizes. All that follows is how to run a platform organization with transparency, not what technology to utilize.
2. Know What You Support
Each platform organization supports various services. I loosely define a service as something that listens for requests from a client that the customer is using. There are a variety of service support levels. A Network Operations Center (NOC) is a form of service. The NOC is listening and waiting for requests. A Service Bus that is implemented at each data center and supported by a centralized team is a service. Each implementation of the Service Bus is listening for and relaying messages.
The software developer creates a software product. A product can utilize services as outlined above. A service can be defined as a product as well. A software product is versioned with specific features and bugs. A product has customers with expectations. A service that is also a product will have similar customer expectations. The critical attribute of a product is the customer. A product has a customer. For example, when you acquire a product, your expectation may be that it will continue to function as is for a year. When the product fails after a month, you expect to have it replaced or to be compensated. A product has transparently communicated iterations of increasing functionality, also called versions. Each product version has advertised, expected functionality. Product bugs happen, but are expected to be corrected in a reasonable period of time.
So the point here is that, from a platform leadership perspective, you always have customers. Therefore you manage products for your customers. You have services as part of the functionality of your product(s). If you were to focus only on the services, you would miss the customer aspect of your responsibilities. You have products that you stand behind, and you guarantee a level of service delivery to your customers.
Now Define What Products You Support
Make their versions, service levels, end of life, outstanding customer requests, support, documentation, current features, and features to be implemented transparent to everyone. Being transparent means starting with everything being public information.
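One way to make this concrete is a public catalog entry per product. The following is only a minimal sketch of that idea: the `ProductRecord` class, its field names, and every example value (version number, service level, addresses) are my own hypothetical placeholders, not a prescribed schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class ProductRecord:
    """One public catalog entry for a platform product (hypothetical schema)."""
    name: str
    version: str
    service_level: str
    end_of_life: date
    support: str
    documentation: str
    current_features: list[str] = field(default_factory=list)
    planned_features: list[str] = field(default_factory=list)
    customer_requests: list[str] = field(default_factory=list)

    def to_public_json(self) -> str:
        """Serialize every field; nothing is private by default."""
        data = asdict(self)
        data["end_of_life"] = self.end_of_life.isoformat()
        return json.dumps(data, indent=2)


# All values below are invented for illustration.
service_bus = ProductRecord(
    name="Service Bus",
    version="2.1.0",
    service_level="99.9% monthly availability",
    end_of_life=date(2027, 6, 30),
    support="platform-bus@example.com",
    documentation="https://docs.example.com/service-bus",
    current_features=["message relay between data centers"],
    planned_features=["dead-letter queue"],
    customer_requests=["larger maximum message size"],
)

print(service_bus.to_public_json())
```

The point of the sketch is the default: every attribute listed above is serialized and published, and anything that needs to stay private would have to be an explicit exception.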
3. Know Your Customers
So often, we rush through designing and implementing a great product that has little to no customer interest. You must understand and personally know your customers. Schedule recurring meetings with your customers and key product managers. Hold those meetings on time no matter what. Delivering bad news in person is just as important as delivering good news. Remember that the platform organization exists to serve its customers, and that your platform customers are the people who make the profit in the P&L.
4. Define Your Roles
Every organization needs well defined roles with responsibilities. I have used the RACI model (Responsible, Accountable, Consulted, Informed) to help describe the roles. I have outlined below the most critical roles for a platform organization with a few of their most important responsibilities.
Image 1: Platform Organization Hierarchy
In my definition of the platform organization, everyone reports up through the Platform SVP. There are four Development VP positions for Services, PaaS, IaaS, and Hardware, one Product VP position, and one Operations VP position. Six VPs report to the SVP. The Network Operations Center, with the first and second level operations engineers, reports up through the Operations VP. The Development VP positions oversee their slice of the platform technology and are accountable for the third level operations engineering. The Product VP oversees the Product Managers, Scrum Masters, and Release Managers. The Product Managers report directly into the Product VP, while the Scrum Masters and Release Managers for each technology slice can be dotted line to the Product VP. This allows the Scrum Masters and the Release Managers to be closer to the developers that they support. As long as there is accountability, it should not matter who reports directly to whom.
Each and every person in the organization is expected to be involved in development and operations practices through CI / CD. That means everyone practices DevOps and the resulting agile behaviors.
Platform SVP role
Responsible for the annual strategy derived from the CTO strategy
Accountable for leading the organization
Accountable for hiring practices of the organization
Responsible for the organization budget
Accountable for the platform products
Accountable for on-boarding new customers through platform operations
Recognize that the buck stops here with the SVP. The SVP delegates responsibility to the chain of command.
Development VP role
Responsible for the annual strategy for their slice of the technology organization
Accountable for quarterly objectives
Responsible for leading their slice of the organization
Accountable for hiring in their slice of the organization
Accountable for their part of the organization budget
Responsible for their technology slice platform products
Accountable for layer three operations engineering support
Development Manager role
Very similar to the Development VP role, except they are responsible for the capabilities of the VP in their slice of the organization
Product VP role
Accountable for quarterly objectives
Responsible for the Platform Product Managers and Scrum Masters
Accountable for regular meetings to determine the status of the release schedule
Accountable for planning and organizing the product roadmaps, releases, and reporting
Responsible for the Product Management retrospective and Product Management roadmap review scoring processes
Accountable for the platform products schedule
Responsible for coordinating product release schedules and product milestones across the entire organization
Consults with Security, Operations, Open Source, Legal, and other horizontal teams on requirements and assistance
Product Manager role
Very similar to the Product VP role, except they are responsible for the capabilities of the VP in their slice of the organization
Responsible for prioritizing the Product Roadmap(s) features while working directly with the Scrum Master and Development Manager
Responsible for product backlog scoring used during scrum retrospectives
Collaborates with other Product Managers on Product Roadmap risks and dependencies
Responsible for demonstrations, roadshows, and product showcasing
Accountable for engaging with customers through Customer Advocacy meetings
Responsible for marketing the product for customer adoption and on-boarding new customers
Responsible for accepting or rejecting product delivery from the development team for a release based on product reviews and quality
Operations VP role
Accountable for quarterly objectives
Accountable for general operations support (layer one and two) including the Network Operations Center
Responsible for managing alerting and monitoring services
Responsible for managing the CI / CD infrastructure
Responsible for on-boarding new customers into the platform organization
Release Manager role
Responsible for their own personal quarterly objectives
Accountable for software pipeline quality
Responsible for tracking unit and build test application. Are the tests doing anything useful? Do the tests get set to noop during the push for release?
Responsible for working with the Development Manager(s) on how testing can be improved. Are the same tests being run for gate and build? Can some of the tests be run pre patch submission by the developer? Are there some project teams that are having problems with test implementation?
Responsible for working with developers and/or the Infrastructure Manager to review software pipeline logs for information and errors.
Responsible for working with the Infrastructure Manager on maintaining the software pipeline, with a special focus on keeping the software pipeline functional towards the end of a product release cycle, when there will be heavier than usual load.
Responsible for working with the Product Manager and Development Manager on the product release cycle.
Scrum Master role
Responsible for their own personal quarterly objectives
Accountable for managing scrum or kanban boards for measuring progress against the roadmap
Accountable for holding scrum or project retrospectives
Responsible for working closely with the Product Manager and Development Manager on backlog prioritization
Responsible for working with the Product Manager on the product release
Accountable for managing engineers’ time appropriately during sprints
Consulted by the Development Manager on feedback around engineers’ performance, productivity, and quality
Responsible for identifying and removing risks and dependencies in coordination with the Product Manager
Engineer role
Responsible for their own personal quarterly objectives
Accountable for completing work assigned by the Scrum Master
Responsible for working within the DevOps software pipeline(s)
Accountable for personal technical capabilities
Responsible for mentoring junior engineers
Responsible for practicing transparency and collaboration
5. Communicate on a Well Defined Schedule
Communication milestones that your customers can come to expect are critical to gaining trust. Most importantly, these milestones provide transparency into the product development process for customers and collaborators. If delivering on these milestones becomes difficult, consider moving the Scrum Master and Release Manager roles under the Product VP for more accountability in the product management process.
Bi-weekly communication to the platform organization
Progress on features
More in-depth information from blog posts
Monthly communication to the company
New product updates
Few blog posts to highlight
Weekly to bi-weekly project retrospectives
Backlog progress scoring based on backlog reviews by the Product Manager
Project epics are created and updated by the Development Manager, Product Manager, and Scrum Master. Then the epics are kept up to date throughout the quarter
Monthly Product Roadmap reviews
Each Product Manager updates their published roadmap monthly
Each Product Manager publishes a Product Roadmap quality score for each product they are responsible for. The roadmap quality score is based on: are all the roadmap details available, is the roadmap published on-time, and the quality of the roadmap details.
Monthly updates on product and project status based on Product Roadmaps and Releases
Report published with last quarter product releases, scoring metrics, progress on features, status of risks, dependencies, and the status of customer requests
Updated, published annual product release schedule
Quarterly product roadmap reviews
Features, bugs, risks, and dependencies
References to epics for cross-project discussions
Quarterly Headcount and Finance review
Travel, equipment, events, sponsorships
Headcount adjustments by project, product, and roles
Fine tuning from the annual review
Quarterly Customer Advocacy meetings
Each functional group of products holds quarterly Customer Advocacy meetings
These customer meetings can happen as often as required, but always with the Product Manager(s) attending and representing the product
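The roadmap quality score from the monthly Product Roadmap reviews can be sketched in a few lines. The three inputs come straight from the criteria above (details available, published on time, quality of the details), but the equal weighting, the 0 to 100 scale, and all the names are assumptions of mine; use whatever weighting your organization agrees on.

```python
from dataclasses import dataclass


@dataclass
class RoadmapReview:
    """Inputs for one product's monthly roadmap review (hypothetical fields)."""
    details_present: int      # roadmap details that are filled in
    details_expected: int     # roadmap details the template asks for
    published_on_time: bool
    detail_quality: float     # reviewer rating from 0.0 to 1.0


def roadmap_quality_score(review: RoadmapReview) -> float:
    """Weight the three criteria equally and return a score from 0 to 100."""
    if review.details_expected <= 0:
        raise ValueError("details_expected must be positive")
    if not 0.0 <= review.detail_quality <= 1.0:
        raise ValueError("detail_quality must be between 0.0 and 1.0")
    completeness = min(review.details_present / review.details_expected, 1.0)
    timeliness = 1.0 if review.published_on_time else 0.0
    total = completeness + timeliness + review.detail_quality
    return round(100 * total / 3, 1)


on_time = RoadmapReview(8, 10, True, 0.7)
late = RoadmapReview(10, 10, False, 1.0)
print(roadmap_quality_score(on_time))  # 83.3
print(roadmap_quality_score(late))     # 66.7
```

Note that under this weighting a complete, high quality roadmap published late still scores lower than a slightly incomplete one published on time, which matches the emphasis on communicating on schedule.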
6. Create a Culture Based on Transparency
Every organization needs a strategy. But very few organizations have a strategy that makes sense to the organization. That is because most people start with blue sky planning. That is a mistake. You are not running a company, but rather a team that supports an existing company with a strategy and goals of its own.
After putting the platform organization together around products and customers, you have a solid baseline for what your strategy and goals need to be. Before that, you would have wasted your time. Now you can be clear and spend the minimum time planning.
Using that baseline, along with the CTO strategy mixed in, the Platform SVP maintains no more than five annual goals, created two months before the new year starts. Those annual goals are the strategy for the organization. The strategy must be clear and timely, so everyone can reference their part in delivering the strategy. The platform leadership, the Platform VPs, will need to use the annual goals to plan their products and headcount for the year.
To publish and communicate the strategy, use OKRs (Objectives and Key Results). My best OKR experience was when each employee started with a blank gdoc indexed to the organization structure. Transparency of everyone’s goals, starting with senior leadership, builds organizational trust and confidence. OKRs can be a key tool for organizational change. Transparency of leadership’s goals is an important aspect of open source behavior derived from Open Source First. Once the Platform SVP publishes the annual goals as OKRs, everyone can read, write, discuss, and debate the annual strategy.
The organization then updates its OKRs quarterly. Senior leadership should take no longer than a week to create and publish their OKRs. Senior leadership and the rest of the organization publish their OKRs for the next quarter two-thirds of the way through the previous quarter. Take no more than a couple of weeks following the first round of OKR publishing to debate and revise any major inconsistencies. In terms of timing, that means the whole organization will have its OKRs completed for the next quarter weeks before that quarter starts.
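The timing works out as simple date arithmetic. Here is a minimal sketch, assuming calendar quarters and reading "a couple of weeks" as 14 days; the function name, the returned keys, and the 2025 dates are mine, purely for illustration.

```python
from datetime import date, timedelta


def okr_schedule(quarter_start: date, next_quarter_start: date,
                 revision_days: int = 14) -> dict[str, date]:
    """Return the publish and finalize dates for next quarter's OKRs."""
    quarter_days = (next_quarter_start - quarter_start).days
    # Publish two-thirds of the way through the current quarter.
    publish_by = quarter_start + timedelta(days=(2 * quarter_days) // 3)
    # Allow a fixed window to debate and revise major inconsistencies.
    finalize_by = publish_by + timedelta(days=revision_days)
    return {"publish_by": publish_by, "finalize_by": finalize_by}


next_start = date(2025, 4, 1)
schedule = okr_schedule(date(2025, 1, 1), next_start)
slack_days = (next_start - schedule["finalize_by"]).days

print(schedule["publish_by"])   # 2025-03-02
print(schedule["finalize_by"])  # 2025-03-16
print(slack_days)               # 16
```

For a 90 day quarter, publishing at day 60 and revising for two weeks leaves a little over two weeks of slack before the next quarter begins.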
Do not be tempted to create a hierarchy of OKRs from leadership on down. I have never seen it work well. If your leadership understands your products and customers, then their goals will be very similar to the rest of the organization. Senior leadership cannot understand all the details of the working parts of the organization. Additionally, if everyone waits for the leadership goals, before starting their own, it will cause delays of weeks to months. It is better to get 80% accuracy in your quarterly goals while publishing them on time. Think quarterly OKR train release.
Points to highlight:
Each person maintains 3-5 OKRs each quarter. OKRs should be their priorities only.
OKRs are not meant to be a project or product management system
Transparency of everyone’s goals to improve collaboration
Each manager holds their directs responsible for their OKRs. Use your directs’ OKRs as part of their leadership mentoring.
Each person rates the success of their OKRs. A 60-80% OKR success rate is what you want. You can also call these stretch goals.
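The self-rating in the last point can be sketched as a simple average checked against the 60-80% band. Scoring each OKR from 0.0 to 1.0 and averaging the ratings are my assumptions, as are the function names and the wording of the messages.

```python
def okr_success_rate(ratings: list[float]) -> float:
    """Average a person's self-ratings, each between 0.0 and 1.0."""
    if not ratings:
        raise ValueError("at least one OKR rating is required")
    if any(not 0.0 <= r <= 1.0 for r in ratings):
        raise ValueError("each rating must be between 0.0 and 1.0")
    return sum(ratings) / len(ratings)


def assess(rate: float, low: float = 0.6, high: float = 0.8) -> str:
    """Compare a success rate with the 60-80% stretch-goal band."""
    if rate < low:
        return "below target: goals may be too ambitious or under-resourced"
    if rate > high:
        return "above target: goals may not stretch enough"
    return "on target"


quarter_ratings = [0.9, 0.6, 0.7, 0.5]  # four OKRs, invented ratings
rate = okr_success_rate(quarter_ratings)
print(f"{rate:.1%}", assess(rate))
```

A rate above the band is flagged as well as one below it, since consistently hitting 100% suggests the goals were not stretch goals.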
Let Transparency Take Hold
In conclusion, take this outline as just that, the broad strokes. This outline is focused on the process steps that can allow development teams to independently create their own way of running their teams within the platform organization. When you have a well structured organization with milestones, transparency, and good communication, it encourages merit based work from everyone. And that is a place I want to work at.
There are consistent truths that cannot be ignored. The speed of light is one. There are others below.
Cloud means distributed systems connected by networks. A vast majority of cloud implementations are really rebranded platform services with their infrastructure in a fluffy icon. So I’m going to replace cloud with platform for the rest of this post.
Platform services exist to support your front end. Period. End of story. That means the platform, in most cases, is the L in your P&L (Profit and Loss). The L must be smaller than the P. I’ll write a follow-up post on economics for those who disagree with me.
What makes money leads the direction of the services required. So a solid, well documented, consistent feedback loop with the platform customers is a sign of a healthy organization. There are some cases of planning for what your front end services need before they ask, like CI/CD or adding new features to existing products, but the majority of what the platform does is respond to the front end services.
All your services need to be deployable by code rather than by hand. There are many options to accomplish this, with no silver bullets yet. Your organization will have its own flavor. Those skills and knowledge must be automated for the maximum rate of success. Even those few organizations that rely solely on public infrastructure still have many processes around managing all their platform services.
Your engineering culture must serve your DevOps needs. That means doing everything within reason to keep your engineering staff happy, healthy, and productive. Listen to their needs. Snacks, activities, and nice chairs are a start. Root access to their laptops, IT kiosks staffed with friendly, available staff, and a culture of mentoring new skills are much, much better.
Your physical compute infrastructure must be close, network-wise, to your data. Latency and throughput can be mitigated, but not ignored.
Public infrastructure does not solve all platform problems. It only makes someone else responsible for them. Do you trust another company exclusively with your future? I didn’t think so. So even if you truly believe public infrastructure all the way, you still need an alternative option. The complexity just gets moved around.
So for those of us who still need physical infrastructure, it is always going to be your largest and longest term cost. Retain the best and the brightest people to decide where, when, and how to build your physical infrastructure. For example, what are the tax implications of building in Singapore versus Hong Kong, what are the data privacy laws in Switzerland, or what is the electricity availability in San Francisco? The right people can help you avoid making long term, expensive mistakes.
The time and people required for acquiring, installing, configuring, and testing hardware and software are a constant. Public infrastructure and containers do not make these things disappear; they just move the problem to another system. Constants must be dealt with by a robust capacity planning team. These are the people who are in the middle of your customer feedback loop and know your operations and physical infrastructure developers very well.
An operations team that doesn’t thoroughly understand its workloads will suffer major outages. It’s not if, but when. The best way to understand your workloads is by exhaustive research and development. Everything breaks. Figure out how your hardware and software break so you can mitigate the pending failures. Software pipelines, Performance Engineering, Quality Assurance, and a culture of R&D as a practice for all of your engineers will get you most of the way there.