Process Governance, Information Governance, Asset Governance, Data Governance, IT Governance – if you are working within a medium-to-large enterprise, there is a good chance you’ve heard of the term ‘governance’. Read on to learn more about the types that exist, and some conceptual details about each.
Common Governance Types (or “domains” that benefit from Governance)
Governance approaches and controls can be applied to almost any entity or subject that the organization has a practical reason to put control mechanisms or processes in place – whether it is for Regulatory Compliance or other operational benefits.
Starting at the highest level for a private organization (for a public organization, it would be the national Government, for example) is Corporate Governance – which can be defined as:
The operating model with rules and practices by which a board of directors ensures accountability, fairness, and transparency in a company’s relationship with all its stakeholders (financiers, customers, management, employees, government, and the community). (Many definitions exist; this one is given for reference.)
Corporate governance is often a comprehensive look at both structures and relationships that determine the direction of the corporate entity. In this perspective, the shareholders and the management are the primary participants, other than the board of directors – which is typically a central role, at this governance level. Employees of the company also get a seat in many corporate governance settings, as well as representative parties of customers, suppliers, and creditors – working within the constraints of various legal/regulatory and ethical/institutional rules and regulations that may supersede whatever the manner of governance the corporation adopts.
Providing a bit more ‘conservative’ perspective on the practice of putting in measures to mitigate the risk of incorrect or outright wrong information – is the level of information governance. There are multiple functions that need to be managed in how technologies are utilized, and typically as a partnership with Information Security organizations, programs are put in place to indicate:
- What information is retained
- Where it is stored
- How long it is retained
- Who has access (and what sort of access) to it
- How that data is protected
- How policies, standards, and regulations provide assurance
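The questions above can be captured as one policy record per class of information. What follows is only a minimal sketch in Python: the `RetentionPolicy` name, its fields, the example values and the `is_past_retention` helper are all hypothetical and not taken from any standard or product.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class RetentionPolicy:
    """One record per information class; every field name is illustrative."""
    information_class: str   # what information is retained
    storage_location: str    # where it is stored
    retention_days: int      # how long it is retained
    access: dict = field(default_factory=dict)            # who has access, and what sort
    protection: str = "unspecified"                       # how the data is protected
    governing_rules: list = field(default_factory=list)   # policies, standards, regulations

    def is_past_retention(self, created: date, today: date) -> bool:
        """True when a record created on `created` has outlived the policy."""
        return today > created + timedelta(days=self.retention_days)


# Hypothetical example values, for illustration only.
invoices = RetentionPolicy(
    information_class="customer invoices",
    storage_location="finance document store",
    retention_days=7 * 365,
    access={"finance": "read-write", "audit": "read-only"},
    protection="encrypted at rest",
    governing_rules=["internal records policy"],
)

print(invoices.is_past_retention(date(2010, 1, 1), date(2020, 1, 1)))  # True
print(invoices.is_past_retention(date(2019, 1, 1), date(2020, 1, 1)))  # False
```

Keeping all six answers in one record also makes the ownership question visible: whoever maintains the record is, in effect, the owner of that information class.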
The challenge many organizations face is connecting these programs under one umbrella and correctly assigning ownership – sometimes to legal, sometimes to IT, and sometimes to compliance. Each organization is different, but in general, the following diagram describes the strategic vantage point the governance program can take:
Operating information governance models may differ in structure or ordinal – but stakeholder perspectives hold true almost all the time. Partly because of the explosion of data volumes in recent decades, and the increase in regulations and compliance issues that followed, traditional ‘records management’ capabilities failed to keep pace – requiring a more descriptive maturity model. Organizations now need to deal with many different standards and laws that apply to information handling, such as:
- The Computer Misuse Act of 1990
- The Data Protection Act of 1998
- The Freedom of Information Act of 2000
- The Privacy and Electronic Communications Regulations of 2003
- Payment Card Industry Data Security Standard (PCI DSS)
- Health Insurance Portability and Accountability Act (HIPAA)
- General Data Protection Regulation (GDPR)
- California Consumer Privacy Act (CCPA)
- and more…
When information resources effectively support the business goals, the organization can accomplish its strategic goals more efficiently – which is why information governance should not just be sponsored by executive leadership, but be led at the enterprise level. However, there are other moving parts as well.
The difference between information and data may not be as clear-cut as the difference between software and hardware assets. Information governance outlines responsibility and decision-making accountability – while data governance is focused on the management of unprocessed information (data) at the business-unit level, typically:
- Availability (scope/delivery)
- Usability (structure/semantic)
- Integrity (referential/consistency)
- Security (access/retention)
With the need for business intelligence, data governance has become a priority in many organizations, so that they can produce the reports that regulatory needs call for. Irrespective of compliance needs (similar to Information Governance compliance needs) at the data level, organizations of medium-to-large sizes inevitably recognize that cross-functional tasks can no longer be implemented efficiently.
Notably, technical capabilities are more concentrated with fewer professionals in the industry – as technology advances have allowed for various levels of Information Technology (IT) to converge. Similar to software solutions engineering practice, governance at the data level now requires tactical deployment to be delivered to provide quick ‘wins’ and avert organizational fatigue from a larger, more monolithic exercise.
Most businesses need to comply with a myriad of regulations, often within their operating nations – sometimes with more than a few countries globally.
Privacy, security – you name it, there is a reason for some legal regulation to exist and for the business to comply. In many cases those rules dictate where and how data generated within each geographical divide can be collected, stored and used.
Is it Required?
It depends on the business model. Most businesses that have a wide scope or handle privacy-sensitive information usually have some requirements. If there is a person/team that handles compliance in the company, that’s the first contact to find out how much work would be needed.
What happens if we fail to Comply?
Again, the effect of a regulatory compliance breach depends on the business model and existing (of course) regulations in the business scope. It can range from some nominal penalties and warnings (with a promise to do better next year) to blocking a business operation or even more severe.
1. Participate in an audit of vulnerabilities
Are you given instructions from the Compliance team/person in the business? Follow those first. But go in with an understanding of what’s really being achieved, shared among the participants of the process. Otherwise, it can feel like it’s just something that HAS TO BE DONE without a real benefit. In reality, it’s a few people’s jobs and/or a risk-management scenario – which is always a good thing. If you feel angry about it, talk to people about why, so a constructive dialogue can be had, rather than boiling in the background about how ridiculous it is.
1a. Find out the big picture scope of work
Some participants can focus on smaller scopes of regulatory compliance. Many techies have rather large scopes to know – and to comply with. If a C-level needs to step in to properly align priorities and schedules, then so be it. It’s about time they got their hands dirty 🙂
1b. Compile a list of what falls in that “big” scope
If you’re DevOps, this means resources. If you’re a developer, then it’s the applications that are potentially exposed. For an integrator-level role, this means almost everything. 🙂 But use a bucket approach or otherwise categorize the risk factors – so you don’t take on (or even think about) too much.
1c. Compile a backup list of what should NOT be in that scope
- DevOps: anything beyond code-branching and pathways to a build, deployment to live environment, and feedback system might be recommended not to be in your scope
- Developer: everything from work-stream to code-merging makes sense to be in the scope for this role. Talk to the compliance team regarding any specific concerns outside the initial scope.
- Integrator: start by identifying points throughout the system that are (or may be) of compliance concern (data privacy and security, etc.). Work with the compliance team as the SPOC for tech.
The actual work needs to be the focus, not what ‘should’ or ‘should not’ be in scope. The team should not follow dogma, but the utility of each compliance rule: whether it can be meaningfully reported on via technological means.
2. Plan how to maximize information
The most frustrating part of the audit process is not the volume of the questions, most of the time. As one might expect, the volume of questions is linearly correlated with the complexity of the business and/or the scope of the audit. There is almost no way to circumvent this – other than to delegate in a costly manner. But there are ways to plan around whatever reality one needs to deal with.
2a. Agree on the data points truly necessary
Not to knock anybody, but there are times when a compliance officer, internal or not, is eager to make an exhaustive list of the data needed to be compliant. Whether driven by rigor or a desire to make a name for oneself, such effort is commendable to a degree; but if an item serves zero value outside the compliance audit scope, there is probably room to reduce the list. The important thing is to be compliant, which by definition is driven by rules and appended by interpretation.
2b. Reduce repetitive processes
Being asked the same question over and over? Maybe you forgot to agree on what parts of the audit process might be duplicative. There are ways to achieve checks and balances during an audit, and some of those include confirmation of collected information. If there are reports of numerous channels regurgitating the same information, it would be prudent to curtail that flow with a properly scoped hold & review of the process at issue.
The easiest way to prevent such a situation is to create a cross-functional map of audit process vs. necessary information – but expect to find that more compliance audits are about checkpoints than planning.
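One way to picture such a map is as a lookup from each audit step to the data points it asks for, inverted to show which data points are requested more than once. This is a rough sketch only: the step names, the data points and the `find_repeated_requests` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical audit steps and the data points each one asks for.
audit_map = {
    "access review": ["user list", "role assignments"],
    "vulnerability scan": ["server inventory", "patch levels"],
    "change management check": ["deployment log", "role assignments"],
    "vendor questionnaire": ["server inventory", "user list"],
}


def find_repeated_requests(step_to_data):
    """Invert the map and keep data points requested by more than one step."""
    data_to_steps = defaultdict(list)
    for step, data_points in step_to_data.items():
        for point in data_points:
            data_to_steps[point].append(step)
    return {point: steps for point, steps in data_to_steps.items() if len(steps) > 1}


repeated = find_repeated_requests(audit_map)
for point, steps in sorted(repeated.items()):
    print(f"{point}: requested by {', '.join(steps)}")
```

Each data point that shows up in the result is a candidate for being collected once and shared, instead of being answered separately for every step.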
2c. Allocate resources efficiently
Flow charts are often used in designing or describing a process. It’s really a personal vice – that I prefer visual diagrams over literal outlines. So much so that I find myself hoping inputs to be in visual formats, as well. I do realize this to be a nasty habit that eats up productivity at start, but I’d argue it also makes communications thereafter about the process a lot simpler.
And if the audience finds the diagram too straight-forward? A job well done, I say.
Granted, there are depths to visual presentation that call for rigor – not just a high-level sketch. The variety of notations and mark-ups always makes it a challenge to fit more than a couple of audiences. (Some want UML, others ERD, BPMN, G/S, the list goes on)
As the title notes, I am looking at data flow diagrams this time, which are used to describe the path of data that will traverse a scoped process. Many wikis would indicate the Gane-Sarson notation to be the most clarifying form, and while I do not necessarily disagree, I have had an experience where it did not do its job.
When your co-workers do not know it, it’s a poor choice – no matter how one might see its appropriate use.
If the team is more familiar with a basic flow chart with no more than control-flow shapes, then it’s a waste of productivity for each of them to learn a new notation. Rather, the creator of the diagram delivering a cross-functional chart at multiple depth levels can deliver the type of clarity we all hope for.
Unless I am consulting for a client who wants a specific notation, what is the value of sticking to a version that will take longer for any team-member to understand?
While no one likes working with negative productivity, sometimes we will find ourselves not contributing in a constructive manner but detracting from the overall productivity. This can/will be discouraging. Becoming sour at oneself, or at anyone else in the team whom one recognizes as a negative contributor, is just adding to the madness, however. I feel the best way to get over it is to take a closer look at what happened, to prevent recurrence as much as possible.
What is negative work?
The idea of negative work is not readily acceptable. Someone can deliver zero productivity by ‘not working’ at all. How does negative work happen?
Let’s go through an example: an awful team-member who once worked with me. The person made roughly 3 significant changes to the work process and output over his 7 months there. The result first: the first 2 changes simply did not improve any part of the team’s work and broke other processes and capabilities. The 3rd change was to revert everything back, explicitly.
So adding a couple, then subtracting to revert the previous 2 may look like it’s zero productivity. The point is that multiple people also had to follow his mistakes until it was resolved. Don’t get me wrong – the person was not of ‘junior’ caliber. It would not be fair to expect net-positive productivity from the get-go, especially for a person who needs mentoring and leadership. Experience was there, as well as the ability to explain his work – the person also had the title ‘senior’. Others not only had to track down why something wasn’t going as planned, but had to figure out whether the issues they were seeing were the result of their work or someone else’s. Since it’s not often that the person with the mistaken sense of correctness is up to fixing whatever is supposedly done wrong, the whole team had to discuss fixing the issue. The team had to collectively determine that neither of the 2 changes was worth fixing and that the team was better off reverting those changes.
Across the entire team, the result was that a lot of hours were wasted. The one member did zero net-work and also reduced others’ productivity. Again, I’m not saying this is a special case or that it is inconceivable.
The other kinds…
There are also many workers who can make changes that work, but end up with a large amount of confusion or lack of clarity – that others have to spend a larger amount of time trying to understand how & why those changes are necessary. To be sure, understanding why a change is needed and how the changed process or item leads to improvement is not always necessary – and often outside scope of many mid-to-entry level workers. And for those whose work-scope includes such an understanding, it can be time-consuming, regardless of the quality of work. But the degree of difficulty in understanding something does matter.
Yet another sort of productivity killer is sticking to workflows that are eating up time. I once worked at a different company where we were trying to decide on using a new way to integrate multiple developers’ work product. The reason: the method we were using was taking hours of whole-team discussion, when only subsets of the team were needed for many of the integration sessions. This was before modern integration practices (such as continuous integration, DevOps, etc.) proliferated, so we did this twice a week. Most of the team were convinced the new method would bring it down to under an hour. It did not go well. It took the team 6 months to put the new method in place, because one influential team-member didn’t want to change the existing process and was able to veto major changes.
3 hours x 2 times a week x 6 months x 4 weeks a month = 144 hours
1 hour x 2 times a week x 6 months x 4 weeks a month = 48 hours
Essentially, 96 hours were wasted. But that is just for each member of the team. Multiplying 96 hours by the number of team members shows an even more egregious waste of time that could have been spent on other items. (one excellent reason why things like BPR – business process re-engineering – are a clear need in some situations)
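The same arithmetic as a small Python sketch, to make the team-wide total concrete. The session lengths, frequency and duration come from the figures above; the team size of 8 is a made-up number, since the real head count is not stated.

```python
WEEKS_PER_MONTH = 4  # same simplification as in the figures above


def session_hours(hours_per_session, sessions_per_week, months):
    """Hours one person spends in integration sessions over the period."""
    return hours_per_session * sessions_per_week * months * WEEKS_PER_MONTH


old_method = session_hours(3, 2, 6)   # 144
new_method = session_hours(1, 2, 6)   # 48
wasted_per_person = old_method - new_method  # 96

team_size = 8  # hypothetical; substitute the real head count
wasted_team_total = wasted_per_person * team_size

print(old_method, new_method, wasted_per_person)  # 144 48 96
print(wasted_team_total)                          # 768
```

With the assumed 8 people, the 6-month delay costs 768 person-hours.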
Negative work –> Low Morale
Most people work to feel a sense of accomplishment, aside from making their ends meet. They want to feel like their time was spent on something worthwhile. That means delivering results that bring value to the business. Wasted time prevents that.
Most of us also want to work with talented people. Working with someone who is a burden on the team is an emotional drain. Not only does that one person act as an obstacle to achieving success for a given item, the influence of such a person can also make us feel we bring less value as a team.
The job market is one way to solve this problem: find another job. This is obviously an undesired outcome for a company. When one team-member leaves, there is bound to be an emotional instability in the rest of the team.
Entrance of the negative worker
How do such people get hired, when it is apparent that the cost of a negative worker is so high? An interview process sorely in need of improvement can be blamed for this. But succumbing to the temptation to lower standards or considering attitude less important than aptitude is most likely the less talked about part.
Sometimes a lot of work needs to be done very quickly. If there are not enough human resources in the company to accomplish that goal, then more need to be hired – either in-house or external contractual. If the job market is strong for workers, it will take longer to make a good hire – and this is when the temptation to lower hiring standards is strong. When there is a lot of work, some management start hiring in panic-mode. They think having more bodies in the project can only contribute to more work getting done.
While such a mentality may have been valid in the traditional/manual manufacturing era, it is now far from the truth. The informational and technological capability of this era ensures that quality matters more than quantity, when it comes to productivity. 10 positive, well-communicating contributors may actually bring more value to the workplace than 100 of the mix.
Desperate hiring will not solve the problem of tight timelines in projects. There is a good chance that it will make it worse. Negative workers will not only slow the team down, but they can cause your great producers to leave the company or team. Sadly, the project will now be even further behind schedule.
The flip side – a manager
From a management perspective, a way to get a sense of productivity is to measure the decrease in administrative workload. More often than not, if management tasks take less effort to carry out, with the same or better project results, that means someone (or something) is helping to do more by allowing the management to do less. One common mistake is to recognise ‘not having to know’ as a way to determine whether productivity is increasing.
Management is often related to delegating work. In a Forbes article about increasing productivity through delegation there is even a mention of a book, “Work Less, Do More: The 14-day Productivity Makeover”. A single 14-day period is by no means a guaranteed time to overhaul any environment, but a basic goal is required. A good starting point is to understand how to delegate.
Delegating work is not a ‘you know what to do, so I do not need to’ approach. Rather, it is closer to a ‘set it and forget it’ approach. One needs to:
- Choose the tasks to delegate
- Pick the person to delegate to
- Give clear assignments and expectations
- Set a date and ways to follow progress
- Delegate responsibility and authority, not just the task
A couple key factors in a management style that fosters team-building are:
- Trusting those who you delegate work to
- Giving credit publicly
People who micro-manage a team into decreased productivity do so because they mistakenly believe that close scrutiny is the same thing as attention to detail. This failure to appreciate a team-member’s capability (potential or proven) is bi-directional, coming from lack of trust and fear of not getting credit for work done.
I recommend identifying characteristics in yourself first, that match the negative worker qualities. Try to reduce the occurrence of exhibiting them. Then determine how those efforts are paying off, all without involving anyone else. When you are confident about a set of practices, or others commend you, that may be the right time to share the journey.
I would try to take the holistic approach – contemplate as both a delegator and the delegated. Clarity in communication is always key, while committed responsibility is the path to walk.
I am interested in feedback and other ways of approach – for obvious reasons.
Here’s another point in software development culture, that is hard to get settled into an organization. To be fair, it’d be more accurate to say that instead of a culture being wrong, that it’s just different – between people and of course, organizations.
Culture is the shared and similar thoughts and actions of a community. A person may be part of a community and be knowledgeable about the goals and traits of that community – yet have immense difficulty following along. Simply put, an individual’s habits can be changed through personal agenda and strength of desire, but community culture is much more difficult to change.
Unsurprisingly, it’s hard to follow even a commonly known development culture. And the one I bring up today is the culture of ‘peer review’ of code.
Arguably, code-review is one of the most important parts of the culture in a development environment, but it’s hard to find it done well in many organizations. Many know and agree on its importance, but most try, stop, and repeat many times.
Why the difficulty in doing reviews?
The most common response to that question is: no time. It’s an understandable reality – teams have a tough time even implementing features. To many, it may be statistically true that doing code-reviews appropriately saves total implementation time as well as cost, but it also sounds like a pipe-dream – realizable only in an alternate universe.
Some companies enforce code-reviews, but it can also be executed to fulfill a process, instead of serving the real purpose – resulting in bad memories. A few bad runs, and the review process does not become culture but is either stopped or is kept as an ineffective action item.
In my opinion, the culture of code-review is not trivial enough to simply quit because it is tough.
There are many reasons for reviewing. Discerning errors in code, as well as shared knowledge and mutual education are just the first ones that pop into my head. Review has organization-level purpose of improving quality, but knowledge and info is shared through the process, as well as know-how and insights – leading to mutual growth of developers. The knowledge community can be enhanced through reviews. In fact, I would even argue that without code-reviews, the core capabilities of developers will be tough to improve.
DevOps, Data-collection, Data-Delivery, Analytics, BI, Compliance.
These are fine goals, which our company and every OpBrand in the group are trying to achieve.
We have IBM managing our IaaS, mostly. However, our business projects and their management are not. Going to work on changing that, for 2016 – H2.
Hopefully that helps in automating parts of project-management with our infra and compliance management, via integration. If there is an existing solution approach, I’m all ears.
Suggesting a near-real-time data collection and delivery from multiple sources is currently under way – bubbling it up the group for buy-in and budget.
Analytics and BI come after that – R modeling and predictive analysis is the goal – not just for software development, but the rest of the business.
People sometimes ask me what the number one thing is that I wish did not happen in the tech development industry. It may be different some days, but often it is:
Due yesterday, needed it yesterday & variations
I have nothing against working hard or working fast. In fact, I have no doubt that is one of the primary drivers working for the software or tech industry. The capability to achieve same or similar amount of work in a shorter time is competition won. The manufacturing industry has showcased how this is true, for several decades. The Lean Process and many offshoots also guide us in how to try to replicate such success in the newer industry of virtual products & services.
Interestingly, the very application of such a need-driver of “faster” and “deliver ASAP” can seemingly become a toxic element in the companies that try to harness the Agile and Lean processes – manifested as technical debt, in the common techie view. The prototype is built quickly, but the final delivered product takes much longer, the quality suffers AND the maintenance and support effort needed becomes unexpectedly large – which leads to terminating the product support. Sometimes it can get as bad as leading to an organization’s demise.
Why and how does ASAP-culture become problematic in the software industry? The cause can be summarized into three points:
First, software is complex. Apologies to other industries that may object, but software engineering has become a most complex information industry field. Someone shouts ‘just start coding already’ and the prototype is started – and the architecture becomes a blurred mess of lines and the actual implementation is delayed as a result. One alternative of taking the time to create necessary documents and diagrams would be ideal, but if it’s followed true to its literal dogma, too many projects will not finish within 10 years. Optimal balance between the two is the most difficult to achieve.
Second, software engineering, at its epitome, is a near art-form. Architecture in software is so much like creating an art-piece that hurrying it along or pulling an all-nighter does not guarantee faster arrival or a better result. Embarking on implementation without being given enough resources to deliver a near-optimal architecture often entails 100 or even 1000 times the resources to make modifications after a certain number of iterations – maybe even releases.
Third, software is like a living thing. There are several concepts brought in from building architecture, and many are very similar. But the difference is that a building’s foundation or structural pillars do not change much for a century, while software architecture becomes more emergent after its release into the wild, being improved and re-worked as needs arise. A common ball-park estimate is that maintaining a software system after its release takes 4 times the resources it took to develop. But if the architecture is not flexible or modular enough, it may end up costing upwards of 10 times that (i.e. 40 times the amount taken to implement the original released version) – and still have difficulty keeping up with maintenance or upgrades.
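To put the ball-park multipliers side by side, here is a small Python sketch. The multipliers (4 and 40) are the ones quoted above; the development cost of 100 units and the `lifetime_cost` helper are arbitrary and hypothetical, and the model is deliberately crude.

```python
def lifetime_cost(development_cost, maintenance_multiplier):
    """Development plus maintenance, with maintenance as a multiple of development."""
    return development_cost + development_cost * maintenance_multiplier


development = 100  # arbitrary units, for illustration only

typical = lifetime_cost(development, 4)   # flexible, modular architecture
rigid = lifetime_cost(development, 40)    # inflexible architecture, 10 times the maintenance

print(typical, rigid)  # 500 4100
```

Under these assumptions the initial build is a fifth of the lifetime cost in the typical case and barely a fortieth in the rigid one, which is why time saved by skipping architecture tends to be paid back many times over.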
Then is it better to be thorough and develop slowly? Never.
Most global, commercially successful software products are not slow, monolithic solutions with really long update cycles. The foremost utility of software development is to deliver market-ready software faster – in other words, availability ASAP is both the curse and the driver of the need. Any factor that slows that goal, whether methodology, regulation, limitation or process, can and should be scrutinized for validity.
In other words, the ASAP, impact-driven culture around software delivery is what actually slows down the implementation and eventual delivery of software, and the culture also induces other technical debt not apparent at release time. Some technical debts that are by-products of such a culture are:
- Lack or absence of shared project knowledge, insights and data, and of a collaborative effort toward transparency that keeps everyone on the same page
- Difficult-to-scale, incomplete architecture(s)
- Dependence on a very few key developers
- Numerous copies of duplicated code
- Lack of work-life balance and inability to increase capability individually and as a resource-pool
- Ever-increasing cost of maintenance
This sort of result can occur, regardless of development methodology or tools employed. SRS (software requirements specification) format of technical spec or using TDD to front-load automated tests as technical specification of features are both ways to develop software faster, by clearing away what is not a well-defined set of features and definitions of the ultimate desired need from the product. Documentation, code-review, process of development, etc. are all ways to help reduce learning curve on knowledge sharing – so doing away with those steps in hopes of getting the product delivery earlier is essentially impractical (and borderline mis-understanding of how to increase effectiveness of available development resources).
Of course, in order for such tactics to be effective in delivering software fast, a well-seasoned architect is needed, but the software development culture supported by the company, not just the project team, needs to be at a good, acceptable level of maturity.
Very few, if any, people would argue that the bricks must be stacked prior to actually designing the building. Foundation work and design of the architecture must first be agreed upon, to build a building as quickly as possible, without a heap of modifications and other related troubles later in the building process. By the same token, software needs some firmness, with flexibility and modularity afforded in the architecture agreed upon by stakeholders, to respond to changes in the desired features and build-outs – not to mention facilitating collaborative effort to ultimately reduce both time and cost of development.
In fact, most developers would agree that limits on time create many serious issues for arriving at such architecture(s). Even so, there is a distinct culprit for accelerating the onset of the culture of working in terms of “ASAP”: it’s the business management and the clients/end-users.
To most project management, short-term results are the de facto standard for survival. The owner-stakeholder of a project has a somewhat longer window for producing results, but many managers have 2- to 3-year contracts (if not less). Extension or conversion of a contract into different terms can become horridly difficult if visible results are not shown in that time. Essentially, management that focuses on 6-month to one-year results prioritizes marketing and top-line sales – it falls on others to think about architecture or maintenance costs.
If a company has a firm stakeholder role of a CTO, the CEO might not be able to demand that architectural flexibility or scalability be sacrificed for short-term impacts – but most situations do not have this safety-net for due-diligence.
The client attitude can also be problematic. Most end-users demand a faster turn-around on their feature demands from regional or in-house development teams, while expectations of a fix on a global-scale product usually hover at around 6 months, and thankfully, at that. They know from previous experience that the global enterprise solution provider that they’ve worked with before does not release a fix any sooner than that.
The clients sometimes actually demand that the developer resources work on-location at the client premises. Instead of properly organizing specifications and getting used to developing software collaboratively via modern communication channels, clients often want to have the developers available on-demand and develop as need arises, to prototype as-fast-as-possible.
The business is used to the culture of impact-drive, client-flavored solution being the highest priority in external parties providing software solutions that are customized. Such experience is often times a barrier to scaling the software solution. The business may have achieved number-1 rating in limited regional scope with such approach, but they often face significant challenges scaling the same solution globally, or to global coverage needed for a client.
Having seen such smaller-scale solutions fail, when delivered as a result of an ineffective, ineffective culture-environment in the solution provider, one must wonder how different the expectation of an issue turnaround might be, by a client used to dealing with robust, global enterprise solutions. Not all, but many would be surprised when there is a response that indicated that a bug will be fixed in a day or two. This would be especially true, if for a critical system – possibly leading to a decrease in confidence-level about the solution provider.
Many smaller-scale software development companies fight through a lack of clarity to promise the client a better-fitting solution, and the end result is an overall shortage of capacity in real terms. The actual software developers end up paying for that lack of resources. They give up on balancing work and life and end up burning the midnight oil for long stretches ahead of the promised delivery date. There is no doubt that developer talent has no room to grow in such a situation.
That culture change is difficult is a simple, hard fact. It cannot succeed unless the client, the management and the developers all subscribe to the same change in culture, and foster near-identical expectations about the process as well as the deliverables and timelines. It’s simply impractical to expect to change the world as a single developer. But the first step is communicating the need to change, and change starts with realizing that need. Someone must make the sacrifice of being the first to change and convincing others to follow suit.
But if one developer is the only one who embraces the vision of the changed culture, and other associates and stakeholders do not, expectations will not change. Therefore, the highest priority for the business is to realize how important architectural engineering is, and the first step can be to enhance the capability of the architectural stakeholders on its projects. The business must do its utmost to balance the first-to-market impact driver against the development/maintenance driver, so that instead of just fast, properly-but-fast becomes the solution perspective.
Foundational tools and frameworks, methodology and process need to be balanced. It also helps to look at maturing other development lifecycle cultures and workflows as a way to repair the effects of a short-term fulfillment culture. There is no way forward except to start somewhere, even at a single point in the project deliverable.
It can also become a chicken-or-egg question, but just recognizing that development culture maturity is an important, high-priority issue is in itself the right first step. Culture is not documentation or mandate, but something that must be lived through. Learning from those with prior experience in changing a culture can significantly reduce the time spent on trial and error. It’s a way to improve continually, not a path to a point where progress stops.
Waterfall, Agile or otherwise, that’s why it’s called a method/ology, as opposed to a fixed script to be followed when developing software for the modern world.
how to handle data at scale?
That’s the most pertinent question these days. Unfortunately, no single answer suffices. Here’s what we do at my workplace:
- identify available sources of data
- categorize by type, pricing and other metadata
- determine scope of data
- configure a virtual link to sources
- expose via web-API
- build custom APIs as needed
Many rush through or even skip the first step, identifying the sources. I feel that’s a mistake, unless there are only a few categories and/or sources. For those able, this is the best non-techie task for many techies – so why not start there?
The company or management usually has a wish-list. First filter that down to what’s available and what’s not, and for what reason.
Another good exercise is cataloguing how each data source is accessed.
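A minimal sketch of what such a catalogue could look like, assuming a simple in-memory structure. The `SourceEntry` fields, the source names and the reasons are all hypothetical, made up only to illustrate filtering a wish-list by availability while recording the access method:

```python
from dataclasses import dataclass


@dataclass
class SourceEntry:
    name: str           # hypothetical label for the source
    available: bool     # can we actually get at it today?
    reason: str         # why it is (or is not) available
    access_method: str  # e.g. "sql", "sftp", "rest-api"


# Hypothetical wish-list, as it might come from management.
wish_list = [
    SourceEntry("crm_contacts", True, "in-house database", "sql"),
    SourceEntry("partner_sales", False, "contract not signed yet", "sftp"),
    SourceEntry("public_weather", True, "open data", "rest-api"),
]


def split_by_availability(entries):
    """Return (available, unavailable) lists, preserving wish-list order."""
    available = [e for e in entries if e.available]
    unavailable = [e for e in entries if not e.available]
    return available, unavailable


available, unavailable = split_by_availability(wish_list)
for entry in unavailable:
    # Keeping the reason makes it easy to revisit the entry later.
    print(f"{entry.name}: not available ({entry.reason})")
```

Even a spreadsheet would do the same job; the point is only that availability, reason and access method get written down per source.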
categorize by metadata
How can the data be found or searched for? Numerous criteria for locating the data should be brainstormed, if not already available. Maybe it’s the flavor of social media that the data comes from or relates to. Contacts, or other means of getting access to the data, should definitely be part of this.
If the data is publicly available, it should be included.
easiest, best-case scenario: include the public data API URI
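As a rough illustration of searching by metadata, here is a small sketch. The catalogue entries, the field names (`type`, `pricing`, `origin`, `access`) and the example.org URI are assumptions, not anything from a real system:

```python
# Hypothetical catalogue; every field name and value is illustrative only.
catalog = [
    {"name": "public_weather", "type": "timeseries", "pricing": "free",
     "origin": "government", "access": "https://example.org/api/weather"},
    {"name": "brand_mentions", "type": "text", "pricing": "paid",
     "origin": "social-media", "access": "contact: account manager"},
    {"name": "crm_contacts", "type": "tabular", "pricing": "free",
     "origin": "in-house", "access": "contact: sales ops"},
]


def find_sources(entries, **criteria):
    """Return entries whose metadata matches every given key/value pair."""
    return [
        entry for entry in entries
        if all(entry.get(key) == value for key, value in criteria.items())
    ]


free_sources = find_sources(catalog, pricing="free")
social_text = find_sources(catalog, origin="social-media", type="text")
print([s["name"] for s in free_sources])
print([s["name"] for s in social_text])
```

Note that the `access` field holds either a public API URI or a contact, which covers both the best case and the "ask somebody" case.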
determine what to include
everything is not an option. Use the metadata to set priorities. The smaller or more divided the data, the better.
select no more than a handful. How many depends on both the business and the technology. Is there a set that makes the aggregate meaningful? In what context?
tech might want to test a threshold on something, given the supposedly high volume of data
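One way to turn metadata into a short list, sketched under assumptions: `business_priority` (1 is highest) and `estimated_rows` are hypothetical metadata fields, and the cap of three is an arbitrary stand-in for "a handful":

```python
# Hypothetical candidates with assumed metadata fields.
candidates = [
    {"name": "crm_contacts", "business_priority": 1, "estimated_rows": 50_000},
    {"name": "social_mentions", "business_priority": 2, "estimated_rows": 9_000_000},
    {"name": "public_weather", "business_priority": 2, "estimated_rows": 120_000},
    {"name": "legacy_archive", "business_priority": 3, "estimated_rows": 40_000_000},
]

HANDFUL = 3  # assumed cap; the right number depends on business and tech


def pick_scope(sources, limit=HANDFUL):
    """Rank by business priority first, then prefer the smaller source."""
    ranked = sorted(
        sources,
        key=lambda s: (s["business_priority"], s["estimated_rows"]),
    )
    return ranked[:limit]


scope = pick_scope(candidates)
print([s["name"] for s in scope])
```

Ranking smaller sources first within the same priority follows the "smaller or more divided" preference; a team that wants to stress-test volume could deliberately add one large source on top.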
set up the virtual link
finally we get to the actual implementation. For the first run, do the simplest source, or maybe the smallest. Continue to add source definitions and/or mappings after at least the first one is exposed via a web service.
with a large number of sources, configuration alone will take some time, depending on how organized the end-points need to be.
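To make "start with one source, expose it, then add more" concrete, here is a minimal sketch using only the Python standard library. It is not a data virtualization product; the registry, the `/sources/<name>` URL layout and the demo source are all assumptions for illustration:

```python
import json
from wsgiref.simple_server import make_server

# Registry of "virtual links": source name -> callable that fetches rows.
SOURCES = {}


def register(name, fetch):
    """Add one source definition; call again as more sources are configured."""
    SOURCES[name] = fetch


def app(environ, start_response):
    """WSGI app serving GET /sources/<name> as JSON."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    if len(parts) == 2 and parts[0] == "sources" and parts[1] in SOURCES:
        status, payload = "200 OK", SOURCES[parts[1]]()
    else:
        status, payload = "404 Not Found", {"error": "unknown source"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Run a development server; not called automatically."""
    with make_server(host, port, app) as server:
        server.serve_forever()


# First run: the simplest, smallest source (hypothetical demo data).
register("small_demo", lambda: [{"id": 1, "value": "a"}])
```

Calling `serve()` would start a local development server; each additional source is then one more `register(...)` call, which is where most of the configuration time goes.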
customize as needed
no first release candidate is going to be perfect. Different clients need different things. They may need it as a SQL service, not a web service. Different representations fit varying modes of consumption.
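A small sketch of offering the same records in two representations, assuming an in-memory SQLite database stands in for whatever SQL service a client would really use. The `readings` table and its columns are hypothetical:

```python
import json
import sqlite3

# Hypothetical records, as they might come back from one source.
records = [
    {"id": 1, "city": "Oslo", "temp_c": 4.5},
    {"id": 2, "city": "Lima", "temp_c": 19.0},
]


def as_json(rows):
    """Representation for web-service consumers."""
    return json.dumps(rows)


def as_sql(rows):
    """Representation for SQL consumers: load rows into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, city TEXT, temp_c REAL)"
    )
    conn.executemany(
        "INSERT INTO readings VALUES (:id, :city, :temp_c)", rows
    )
    conn.commit()
    return conn


conn = as_sql(records)
warm = conn.execute(
    "SELECT city FROM readings WHERE temp_c > 10"
).fetchall()
print(as_json(records))
print(warm)
```

The data is identical in both cases; only the mode of consumption changes, which is usually what the customization request boils down to.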
reap the benefits
this is so open, I can only say it depends on how the implementation went, as well as the allowed throughput from the sources.
there is another side, however. We need usage and performance metrics. How will we tell useless skeletons from data with some meat? How does a data set become more attractive over time, and when does it lose its utility?
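A minimal sketch of collecting such metrics, assuming each source is fetched through a callable that returns a list of rows. The `UsageMetrics` class and the source names are hypothetical; a real setup would more likely lean on existing monitoring tools:

```python
import time
from collections import defaultdict


class UsageMetrics:
    """Track hits, time spent and rows returned per source."""

    def __init__(self):
        self.hits = defaultdict(int)
        self.total_seconds = defaultdict(float)
        self.rows_returned = defaultdict(int)

    def timed(self, source, fetch):
        """Call fetch(), record how long it took and how much came back."""
        start = time.perf_counter()
        rows = fetch()
        self.total_seconds[source] += time.perf_counter() - start
        self.hits[source] += 1
        self.rows_returned[source] += len(rows)
        return rows

    def report(self):
        """(source, hits, average seconds, rows), most-used source first."""
        return sorted(
            (
                (name, hits, self.total_seconds[name] / hits,
                 self.rows_returned[name])
                for name, hits in self.hits.items()
            ),
            key=lambda row: row[1],
            reverse=True,
        )

    def skeletons(self):
        """Sources that were queried but never returned any rows."""
        return [n for n in self.hits if self.rows_returned[n] == 0]


metrics = UsageMetrics()
metrics.timed("weather", lambda: [1, 2, 3])
metrics.timed("weather", lambda: [4])
metrics.timed("empty_feed", lambda: [])
print(metrics.report())
print(metrics.skeletons())
```

Storing a timestamp or a per-day bucket alongside each hit would be the natural next step for seeing how a source gains or loses attractiveness over time.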