The Economies of Change

As described in earlier blog posts, much of my economic thinking and analysis is influenced by Schumpeter’s economic theories. The process Schumpeter describes as “creative destruction”, the driving force of change and sustainability in free market economies, can be seen throughout history.

Many institutional frameworks have entirely unraveled, casualties of the Great Recession, whose effects are far-reaching and in some cases yet to be fully realized. Emerging from this downturn are new opportunities for entrepreneurs, the vital source of creative destruction, whose progressive ideas are critical to economic change: they innovate in industries and frameworks that have become inefficient, continuing the cycle of growth.

Technology has driven our culture and how we work, communicate, and function on a daily basis, with the highest rates of change coming over the past 50 years. Every period of expansion has left industries behind, replaced by technological advancement, and without a doubt this expansion, as slow and bumpy as it has been, will be no different. This last recession has left many unanswered questions, and a belief that the evolution currently underway needs to be a radical departure rather than picking up from where we left off. Decisions will be made on which industries and frameworks are worth carrying forward, and on how to adopt new technology to improve our decision-making, what products we purchase, who we want to do business with, social equality, the employment market, education, and information transparency.

More to come.

The Hiatus

Per ipsum sit scientia

When I originally started this blog in 2010, my intent was to provide a different perspective into data, analytics, and the technologies evolving around them. It’s been over a year since I last posted. It’s not that I didn’t have topics: I drafted over 4 different posts, but I found each topic to be either cluttered by discussion or, worse, one where I wasn’t really providing any further insight.

Why the renewal now? First, my exposure to a variety of organizations, and deciphering the data problems each faced, has yielded a breadth of experience. Second, engaging with or applying a breadth of technologies, both old and new, has given me subject matter worth writing about. And on a personal note, I was given a finite period to enjoy the company of a long-time companion who had provided her unconditional support to me; I owed her beach and fetch time.

Observations and lessons learned:

An unwanted side effect of complexity is failure. “Plan for it” is discussed, even documented, but more often than not it’s poorly implemented. Designing distributed computing systems is not easy; if anything you’ve introduced new problems to manage, and availability and consistency shouldn’t be on that list of new problems. I was taught a long time ago that if it’s not stable, it won’t scale, and then you can forget adaptable.

Code is now a commodity, displaced by the consumerization of information, with the emphasis shifting to data re-use. Transportable, and assembled in the most efficient technology, code is now an interchangeable module, no longer constraining the distribution or consumption of data.

The first month of the NBA/NHL/MLB season is irrelevant to most fans, as is TPC-H to anyone who actually analyzes data.

DataRefs is a more efficient join. I learned plenty with MongoDB.

The cost/benefit that RStats + Python + JRuby + Amazon EMR has provided to a customer base is immense.

Enterprise software, as it functions and behaves today, is a leading indicator that much of it requires a complete overhaul. Daily, these on-premise applications continue to look tired, constraining organizations from growth.

SAP’s offering of in-line analytical capabilities across its line-of-business applications is the most complete and robust, and by far leads any other enterprise vendor. Driving efficiency into every part of the business from Supply Chain Management to Human Capital Management, with complex algorithms for everything from forecasting through to optimization, the embedded functional “intelligence” delivers to the business the right process to execute “competitive analytics”. What SAP hasn’t done well is the ability to execute or integrate these capabilities.

“You keep using that word. I do not think it means what you think it means.”

Advanced Analytics. These two words together look weird and make no sense. This was one topic I had started to write a long-winded blog post on, before stuff happened. For the most part the term has disappeared, and for the better; really it just confused customers. The technologies used to execute many of these techniques/methods of statistical analysis, predictive analysis, data mining, and machine learning have advanced in many ways. But categorizing the techniques themselves as advanced made it sound like organizations were getting more now than in the past.

BigData. The jargon and the debate over concrete definitions of “what is” and “what isn’t” continue. Rather than focus on practical use cases for the technologies in the BigData space, and on solving issues that currently exist with the tool set, we want to tell people their data doesn’t fit the problem. If one is prevented from turning the data into actionable intelligence in a required period of time (latency) due to the volume, velocity, or structure of the data (multi/poly), then there is a big data problem to deal with. The complexity factor isn’t so much the data, but rather the analytics, calculations, and processing that need to be performed.

The significance of BigData technologies to me is in the problems I see that can be solved: problems I’ve been faced with, growing problems, and the building of innovative markets. And yes, it’s more than a “Social Kitten” tool. A blog post to come on this topic.

Complex, statistically improbable things are by their nature more difficult to explain than simple, statistically probable things. -Richard Dawkins

The real problem in analytics. It’s become painfully obvious that the disconnect or confusion in the processing and understanding of data is the misalignment between analysis and synthesis. The focus is on breaking down data into granular parts and identifying the patterns, to quantify and connect these findings into a drawn conclusion (e.g. sales results were down). Synthesis is the next step in learning: how do all these parts work together? When combined/brought together, what new concept/measure do we realize from them? The technology is there today (machine learning algorithms, map/reduce) to take many data parts/sources and bring them together, coming up with a new solution/finding and completing the decision-making cycle.

I got to most of what I wanted to mention. I need to reformat the blog, add a roll of more enlightening reads than mine, and share more details of what I’ve learned.

Measuring your Measures a BI Afterthought

“An unsophisticated forecaster uses statistics as a drunken man uses lamp-posts – for support rather than for illumination.” – Andrew Lang though unattributed

A recent blog post highlighting the importance of a Business Intelligence Competency Center (BICC) shed some light on establishing value and trust in the data we use to measure results.

The difference is in “how” an organization comes to the realization that a business strategy must be defined to direct the BI initiative if it is to attain any ongoing value. Partial failure, identified early on, can be an invaluable source for the needed measurements and process monitoring; prolonged, the process will reach a state of total failure, leaving few remnants to gather value from.

Absence of Strategy?

An all too familiar recurring task for many: a multitude of reports are whittled down to the required columns, the data manipulated and reformatted in Excel, then distributed to stakeholders without appropriate documentation to support the question being answered. Data quality is suspect, and the data produced, now in doubt, lacking in value and context, is misinterpreted.

or

Resembling the Sunday Times: the same reports are bundled up and circulated among the stakeholders, who individually analyze the data, duplicating the effort of manipulating and summarizing, and leaving the data and its message in a state similar to what Dante described in his journey through the 8th Circle.

Is one more detrimental to the business than the other? In the first scenario the data is summarized to ease consumption, at the cost of lacking context; the second leaves its audience to sort out the details, resulting in the confusion of repeated manipulation and messaging. The conclusion, though, is identical: neither provides the beginnings of, nor builds upon, a sustainable business intelligence strategy.

So why and how does the process, or lack of one, get to this point? Simply stated, either the question was not sufficiently answered or there was a lack of understanding of the question being asked. The business may be efficiently utilizing the data, and IT may be effectively maintaining all the pieces of the BI environment; what has been lost is the point of reference that could curtail the bad habits that form when people attempt to consume data. It now becomes a lesson in improving the organization’s understanding, communication, and measuring of what can be and is being done with the data today.

Fostering a BI Strategy

A fresh approach, or fixing what’s in place? While new does have its advantages, either path must still identify and shape the strategy from the business goals and needs captured. It’s crucial that these goals become the blueprint for data democratization, and that the outcome empowers the organization with the ability to learn from the data, process transparency, measurement of the effectiveness of the answers, and trust in the data produced.

STATIC it is not. Goals, questions, and measurements will evolve over time; to avoid the scenarios above, the strategy must be dynamic enough to optimize and adapt as well. This is not a list of tasks, capabilities, or a technical description of the BI environment, and while there is a starting point there is no predetermined end point.

TECHNICAL it is not. To embed technology or solutions into the strategy would only inhibit its growth, promote data silos, and lead an organization down the road to failure.

ORGANIC it is. It’s critical that the process, data, and effectiveness of the measures are made transparent; this enables the organization to build on the foundation from the ground up. Questions will be evaluated and determined to be inefficient, redundant, or no longer applicable. From this, new goals, measurements, and data will emerge.

ITERATIVE it is. Whether starting off or salvaging what exists, the design of the strategy and how it is executed must be agile. Start by focusing on a specific business problem or question (e.g. sales and opportunities) to test out how the strategy takes form. As the process matures, measuring and tracking the consumption of measurements and questions will identify which questions can be weeded out based on a state of inactivity.
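Weeding out questions by inactivity can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the `usage_log` mapping, the question names, and the 90-day idle threshold are all hypothetical, and a real BI environment would pull last-consumed dates from its own usage tracking.

```python
from datetime import date, timedelta

def stale_questions(last_used, today, max_idle_days=90):
    """Return the questions not consumed within max_idle_days, sorted by name."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(q for q, used in last_used.items() if used < cutoff)

# Hypothetical usage log: business question -> date it was last consumed.
usage_log = {
    "sales by region": date(2011, 3, 1),
    "open opportunities": date(2011, 2, 20),
    "legacy fax orders": date(2010, 6, 15),
}

# Candidates for review, not automatic deletion.
print(stale_questions(usage_log, today=date(2011, 3, 10)))
```

The output is a list of candidates for a conversation with the stakeholders rather than a delete list; the threshold itself is one of the measurements the strategy should revisit over time.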

Transitioning Strategy to the Tactical

Where should the strategy live? A document or collection of guidelines it is, but it must also be measured against and updated, collaborative, and transparent to all stakeholders. It must be able to collect and capture the required data, mapping the relationships and workflow between goals and questions asked, performance indicators, and measurements.

Reconcile, reconcile, and reconcile. This is an exercise in data discovery: reviewing all available reports, identifying redundancies, asking whether each report is ever consumed, finding undocumented calculations and measurements, and connecting the dots back to a specific business question or process. The purpose here is to reduce output, bring meaning to the data produced, and make what exists usable.

Construct a Data Presentation Architecture program going forward. A mix of business intelligence roles and skill sets, it directly contributes to the optimization and success of the BI strategy. It’s designed to bring relevance to the data, communicate meaning, and produce actionable results.

Both the growth of data and the need to analyze the information are increasing daily. It will be just as imperative to measure, validate, and adjust along with this growing need, to ensure the right data is being applied to the right question.

BI Does Your Organization Good

I assume most have seen either the milk industry’s TV commercials or print ads declaring that “Milk does the body good”. The particular commercial that caught my attention showed a family playing football who began to deflate or break apart, the point being that without milk, bones become brittle.

It got me thinking: does the same rule apply to organizations and business intelligence? Do operations become brittle when data isn’t analyzed right or, worse, not used at all? While most organizations are using some form of tool(s) to analyze the various moving parts of the business, response to new data requests is slow, the usage rate of these solutions is low due in part to functional complexity, and much of the time the results are maintained in silos. These factors contribute to how organizations become brittle in reacting to events, understanding the data, and sharing the information.

Indicee, a SaaS-based BI solution, has taken to solving this brittleness factor that plagues organizations by providing a cloud-based platform to manage the various sources of data, increase user interaction, and offer innovative ways to analyze and understand the data.

Using the content from existing reports and spreadsheets, Indicee gives the user an intuitive self-service workflow to mine and relate data. This mashing up of data from the various applications and spreadsheet-marts inside the organization strips down the barriers of what would otherwise be data silos. This step-through do-it-yourself (DIY) approach lets users load data, identify columns of interest, create measurements, and understand source data relationships, in what I would describe as a guided wizard, with ease and flexibility. This gathering and data learning process can be performed in under 30 minutes, which is about the time allotted for the daily status meeting in a traditional BI project.

Now that Indicee understands your data, ongoing report updates grow the data mart. As for that slow response rate to change that typically breaks organizations, Indicee gets the problem. There is no reload of data here; rather, Indicee recognizes the change, asks how it needs to be included in the fold, and the data is now part of your ongoing analysis.

Intuitive question interface

It’s about learning the data, right? It’s typically at this point that continued adoption comes into question, and brittleness in the data begins to appear, as users reconcile what was needed versus what is available. Worse is the classic BI user interface, made more for creating Visio diagrams than asking business questions, muddied with drag-and-drop functions, and with inquiries returned in the form of SQL. Indicee has lived up to the principle: find simplicity in the complex, remove the noise, and let me use the language I speak to ask the question. What it describes as an “Intelligence Question Interface” has put some UX into the UI, asking “What” I want to measure, “How” I want to organize my data (date, geographic, product), and what to “Filter” from the information requested. This textual process constructs a sentence that describes what information will be retrieved.

Users can then turn this question into a report, a report into a chart, drill down into the details, sort, or even alter the original question. Need to monitor a collection of metrics or KPIs? Reports and charts can be presented in a dashboard to track performance across the organization. Need to add this output to a document or presentation? Simply export the report to either Excel or PDF format, and even include the question asked without having to embed a decoder.

Email is ideal for sending out task or meeting requests; for sharing information it is inefficient, and the message gets lost in translation. The purpose of Business Intelligence was to produce for others to consume, and to share and collaborate on results; for the information to really have value it must have context. Indicee has progressed to this next step of BI evolution, providing a platform where information can be shared among groups and users can make and view comments on reports. Simply put, strength in numbers adds strength to your data, reducing the brittleness.

Thoughts

Indicee is new, and there are trade-offs; the strength of its platform is in enhancing end-user data visibility, rather than building on the foundation of bells and whistles that many traditional BI tools offer. My initial impression was that the data functions provided were light, but my questions were being answered and the results were accurate, which matters the most.

The company is developing a strong ecosystem through its VAR channel, which in the long term should increase adoption and yield deeper value from their customers’ data. The current offering called “Quickstarts”, a predefined set of data marts for a number of popular accounting packages, is a welcome contrast in a SaaS BI space that is perceived as too Salesforce (SFDC) centric.

Initial sign-up is simple and based on a freemium model that starts off with a 30-day trial – if you don’t use more than 10 MB of data, it stays that way. As your data requirements grow, the cost scales: $69/mo for 100 MB. Want more users? $149/mo for a workgroup (5 licenses). I would describe the process as more user buy-cycle driven.

Disclosure: I have been paid for past consulting work at Indicee, though it is no longer a client.

A Database By Any Other Name

Creative forces launch innovation and provide alternatives to a market where choices may not be all that dissimilar. This couldn’t be more apparent than with the current range of alternatives, BigData, NoSQL, or the analytic appliance (ADBMS), looking to displace the de facto RDBMS. The King is dead…long live the new King or, as some call it, cyclical history in progress.

It’s not that this current disruption is lacking in technical or business rationale; rather, it’s the misguided approach some have taken by ruling out the applicability of the traditional dbms. The rdbms space over the past 20+ years has become dominated by the likes of Oracle, Microsoft, and IBM, who have exerted a sense of entitlement in creating a “one size fits all” offering to serve a broad range of applications.

This path to generalized optimization has manifested in bloated functionality and, worse yet, in scalability constraints, all at the expense of the captive user community. The cumbersome and primitive methods of data access and storage that gave way to the mainstream adoption of the rdbms are not that far removed from what is motivating organizations to seek out alternatives today. It may be worth a momentary step back to understand the real problem, so we don’t reconstruct what we are trying to replace.

MySQL gave organizations that first sense of freedom as a low-cost, lightweight alternative that could be supported on commodity hardware platforms. For many this removed the licensing constraints of the enterprise dbms that inhibited them from addressing the growing performance overhead on their transactional systems. They moved this activity to what was being termed a “distributed server farm” to perform tasks like ad-hoc querying, data mining, and online shopping. It was innovative at the time, and compelling to many clients I worked with.

Point of clarification: when I refer to commodity hardware, I’m not talking about clustering together recycled 386 desktops, but about moving away from supercomputers that scaled up and toward clusters of physical hardware that can scale out, utilizing more elastic methods like virtualization.

BigData

The “BigData” problem being bandied about today is not a new problem; many organizations have been working with and analyzing terabytes and petabytes of data for some time. AT&T and Bell Labs developed the Daytona project outside the confines of a traditional dbms to tackle the analysis of their growing volumes of call data.

What this is becoming more about is the openness and availability of the data, along with the increasing number of sources/devices generating streams of data we will need to make sense of. Moreover, unlike AT&T, organizations don’t have to build their own hardware or software infrastructure; the cloud combined with open source projects for the most part takes care of this.

Amazon supplies us with a public cloud that gives us scalable server infrastructure at a fraction of the cost. BigTable and MapReduce (Google), Hadoop (Yahoo), Cassandra (Facebook), and Dynamo (Amazon) have made the solutions used to manage their BigData problems available, as published designs or as open source projects. Commercially, IBM has put its collective effort into working with large data problems, and generated plenty of talk around BigSheets.

The Hadoop project in particular has garnered the most attention. Built on the framework of the Hadoop file system for storage and MapReduce for data processing, it has evolved into an ecosystem that scales with our data storage and processing needs. From HBase for analysis to Hive and Pig, which simplify data queries, Hadoop is on course to righting the data problem.

To learn and understand the data, the methods used to store and process it are critical. The school bell rang in my ear while I was prototyping statistical models against a number of medium-sized data sets, ranging from 500 GB – 1 TB in size and containing 50+ columns per row. Added into the mix was R, an open source statistical language, to process, data mine, and predict possible outcomes. To get to this point I needed to build a dbms, create a data model, load data, aggregate data, and extract and flatten data so R could process it…STOP and RESET

Refine the approach, and think again about what my initial problem was: understanding and learning the patterns in my data. I took the problem to the cloud, utilizing the Amazon EMR platform to load, process, and flatten the data. EMR is constructed on the combined framework of Hadoop and MapReduce, providing APIs that can be interfaced using Python to perform sort, compare, reduce, and map routines, and to output aggregated results that I could then analyze in R. All completed with credit card in hand, for around 10.00 dollars and under 45 minutes of processing time.
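The shape of those map, sort, and reduce routines can be sketched in plain Python. To be clear, this is not the code I ran and it does not use the EMR or Hadoop APIs: it runs locally on a tiny in-memory list, and the `category` and `value` fields are hypothetical stand-ins for the real columns. It only illustrates how raw records get mapped to key/value pairs, sorted by key, reduced to one summary row per key, and flattened into a CSV that R could read.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

def map_record(record):
    """Map step: emit a (key, value) pair for one raw record."""
    return (record["category"], float(record["value"]))

def reduce_group(key, values):
    """Reduce step: collapse all values for one key into a flat summary row."""
    values = list(values)
    return {"category": key, "n": len(values), "mean": sum(values) / len(values)}

def map_reduce(records):
    """Run map, sort by key (the shuffle), then reduce each group of pairs."""
    pairs = sorted((map_record(r) for r in records), key=itemgetter(0))
    return [
        reduce_group(key, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    ]

def to_csv(rows):
    """Flatten the aggregated rows into CSV text for downstream analysis."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["category", "n", "mean"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Hypothetical raw records; real input would be far larger and read from files.
raw = [
    {"category": "b", "value": "4.0"},
    {"category": "a", "value": "1.0"},
    {"category": "a", "value": "3.0"},
]
print(to_csv(map_reduce(raw)))
```

On a cluster the sort between map and reduce is handled by the framework across machines; here `sorted` plus `groupby` plays that role on one machine, which is enough to test the logic before paying for processing time.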

I was able to solve my computational problem by prototyping on Amazon EMR in minimal time, and was isolated from many of the complexities and limitations that can exist with an internally managed Hadoop distributed architecture. Being a batch problem, this can exist and grow in the cloud for some time; a point will be reached where, from a cost perspective, it makes sense to move to dedicated hardware. I guess that’s why companies like Cloudera exist, to help organizations when that time arrives.

My problem isn’t alone in being outside the typical web data use case that Hadoop is being applied to. BioTech, and more specifically bioinformatics, is greatly benefiting from this framework to solve its growing data needs.

NoSQL

1. A movement to make a software engineer’s life easier when dealing with modernized data.

2. Post-relational distributed computing and, as Michael Stonebraker has described it, breaking down the overhead associated with maintaining the consistency (ACID) properties of a transaction.

3. Tossing out a legacy, overused query language that adds to the overhead of managing data transactions.

One and two, yes; three, yes and no. Point-in-time analysis using SQL, which I have dealt with, is a good example of where alternative approaches are required. I’m impressed with how SQLStream is addressing this problem.

There are a number of alternatives being released to solve these data scaling problems; MongoDB (document), Redis (key-value), and Neo4j (graph) are the ones I probably get asked about most often. Each has its advantages but, more importantly, each brings an open discussion to a hard problem. Consider how the alternative manages consistency, persistence, and availability of the data. How does that fit into the application requirements? The thinking that it’s open source and can be customized will be a full-circle trip to the same problems that the rdbms would have caused.

I was introduced to the NoSQL term, and more so CouchDB, while on a data mining project. It was a case of two projects converging: one analyzing test results, another sharing the test results and supporting documents across a number of labs. The problem spelled out: high transaction volumes, and a data structure that didn’t fit into a relational model without frequent altering.

A sought-out opinion became a quick introduction to the CouchDB architecture:

Not an rdbms: a distributed document database with a JSON interface, schema-free, maintaining both unstructured and structured data, with data access through a number of open source methods.

I ask:

What are they getting that a traditional dbms can’t provide, and what are they giving up?

This was an environment with a number of research documents along with 100+ million rows of generated test results that needed to be accessed more frequently, and structures that could change based on the test being executed. Managing and constantly updating a relational data model wasn’t in the plans, nor was the resource to take this on.

The CouchDB platform gave them a distributed/replicated data environment to handle requests as well as transactional volume. While it does preserve the ACID properties, ensuring both data consistency and availability, there was a willingness to work in an eventually consistent state (BASE) for the performance gain. This wasn’t a dbms replacement; it was a new problem that until now couldn’t be solved easily.
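To show what “schema free” bought them, here is a toy sketch in Python. It is not CouchDB and does not use its API: `DocumentStore` is a hypothetical in-memory stand-in, and the test documents and field names are invented for illustration. The point is only that two test results with different structures can sit side by side as JSON documents without altering a data model first.

```python
import json

class DocumentStore:
    """A toy in-memory stand-in for a schema-free document database."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        # Round-trip through JSON so only JSON-serializable documents are stored.
        self._docs[doc_id] = json.loads(json.dumps(doc))

    def get(self, doc_id):
        return self._docs[doc_id]

    def find(self, field):
        """Return the ids of documents that contain the given top-level field."""
        return sorted(i for i, d in self._docs.items() if field in d)

store = DocumentStore()
# Two hypothetical test results whose structures differ by test type.
store.put("t1", {"test": "tensile", "load_kn": 12.5})
store.put("t2", {"test": "thermal", "temps_c": [20, 85, 120], "notes": "rerun"})

print(store.find("test"))   # both documents carry this field
print(store.find("notes"))  # only the second one does
```

In a relational model the second test type would have meant a new column or table before the first row could be loaded; here the cost moves to query time, where the reader has to cope with fields that may be absent. That is part of what they were giving up.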

Analytic Database

Disrupting a space that has long been dominated by Teradata, this new breed of DBMSs has set out to change the economies of scale for many organizations needing to perform analysis and proactive learning on their BigData.

Recently the space was described as over-crowded; I like the competitive field and find that it produces much more open discussion, and innovation for that matter. Leveraging either the scalability of standard commodity hardware or optimized devices for performance advantages, and built on some foundation of the PostgreSQL open source dbms engine, Netezza (relational), Greenplum (relational), Vertica (columnar), Aster Data (relational), and ParAccel (columnar) are now the growing forces in the space.

This is leading the way to a steady stream of product evolution and innovation: massive parallelism, in-memory processing, integration of the Hadoop and MapReduce framework to minimize data processing, deployments of their platforms in the cloud, support for solid-state storage (SSD) leading to significant performance gains in data access, and a reduction of the data movement bottleneck with the hybrid data-application server introduced by Aster Data, which embeds processing logic into the database engine.

Much of the focus to date has been on reducing the access and processing time on the stored data; at some point attention will need to shift to figuring out how to efficiently handle the blend of loading large batch files with real-time data streams.

Co-existing going forward

Neither the rdbms nor SQL is going away anytime soon, for the simple fact that these alternatives were not built as replacements for the dbms but rather to address new data problems. I wouldn’t recommend pulling the rug from under the dbms managing your organization’s financial or ERP systems; there are plenty of best practices around addressing performance concerns with these applications. Most replacements today are due to standardization on a single dbms platform.

I see the offerings being developed by RethinkDB and Drizzle, which are lightening the MySQL framework to improve scalability in their products, as the model or approach going forward. Functionality in these alternatives will be adopted by the dbms vendors to address shortcomings in their own products. The MapReduce model is now being included in Teradata, and I wouldn’t be surprised if Oracle released some integration with the MapReduce/Hadoop framework to meet customer demand. The addition of schema-free support to MySQL may have changed the direction of the referenced project above.

Many of the NoSQL vendors are wrapping a SQL-like interface around their platform to simplify access, but how much functional bloat can be taken on here before the platform deviates from its applicable use? Rather than getting caught up in the CAP theorem discussions or, worse, the coolness factor, remember this is about living with trade-offs and labeling data as a commodity.

The Future’s So Bright, I Gotta Wear Shades

For the independent record labels, I considered the 80s to be a combination of the golden age and the end of an era. The rising economics made it impossible for them to retain their artists and produce records without partnering with, and eventually being acquired by, the larger labels. What made these independents different was being closer to the grassroots scene, the willingness to take on the progressive and innovative, and lacking the generic feel one got from the likes of EMI or Sony.

A reflection on the past decade, and on how the theory of “Creative Destruction” has shaped a climate more suited for innovation today, reveals a shinier outlook than the one bestowed on the 80s independent labels. Joseph Schumpeter, an early 20th century economist, popularized and further developed the theory, in which the process of innovation and progress destroys the old and replaces it with the new.

From the perspective of measured and recorded economic indicators, these past ten years have been described as the “lost decade”, and rightly so. Just look at the employment indicator: an estimated 108 million employed (private services) in 1999, and in 2010, yes, it’s at the same 108 million. This is due in part to the combined loss of net worth caused by both the dot-com and financial bubble bursts, estimated at between 14 – 16 trillion. Though there are positives to show in between, the final numbers are what matter. From an innovation perspective, though, we learned and created a collection of leaner, more agile, and more sustainable environments, keeping the future in mind this time and providing a little light in the tunnel.

The Beginning

At the start of the decade the internet had become THE platform and business model. It had disrupted and changed the way people were searching, doing business, and of course shopping; there was no question about the longevity of this boom. E-commerce was now the buzzword. Amazon and eBay were changing how we buy and how we extend our reach to sell, and PayPal was disrupting how we performed the simple task of commerce on the internet. Even as the Webvans and boo.coms, who collectively took in 400 MM in capital investment in order to innovate, started to show cracks, there was no looking back or stopping the progress in motion.

This collection of websites was being designed and rolled out on best-of-breed software platforms from ATG, Vignette, and BroadVision, which were created to ease the development and management effort required. WebLogic coined the phrase “Application Server”, building a common API layer that allows communication between the database and the application, disrupting not only application deployments on the web but, more so, putting the concept of client-server to sleep for good. This progression was due in part to the influence of Java in the development space, making the hardware layer transparent at execution, allowing for fewer implementations, and displacing and disrupting C/C++ in future application development.

Along with this rise in interest and activity on the internet came dramatic growth in the data generated, and the need to act on it in real time became a necessity. Much of the effort to date to analyze this semi-structured data would be considered primitive, time consuming, and nowhere close to real time. Personify, a platform I had experience with, would be a pioneer in how we interact with this deluge of data, relate to our online shoppers to understand their behavior, and gain further insight into why Ryan just left his shopping cart in the middle of the digital aisle and walked out the door. Much of what they did gave way to real-time analytics.

The B2B landscape was partaking in this change as well; the web was becoming the presentation layer of choice and led to efficiencies in business processes. Ariba innovated and transformed how we did procurement, providing a transparent landscape for working with vendors. Wanting to differentiate itself from ERP and SCM, it created the term Operating Resource Management (ORM) and started to carve out a market from the leaders, Oracle and SAP. Integration had a new buzzword, EAI, which gave organizations the capability to share information between newly acquired and legacy applications with minimal customization, from the likes of TIBCO, webMethods, and SeeBeyond.

Reality Returns

So what went wrong? It is a topic that has had its share of talk time; bottom line, this innovation came at a cost. The simple economics demonstrated that the costs associated with building out and maintaining this infrastructure (software, hardware, hosting) were not supported by the revenue these start-ups were able to generate. Though the wheel of progression was slowed, it had not stopped. A number of start-ups would cease to exist, either from the lack of viability of their business model or as victims of the fallout, and investors were left to lick their wounds, but through it all this didn’t deter those who continued forward. Creative destruction, thought to be stalled, was already paving over an industry barely out of infancy, the outcome being closer scrutiny of costs and of the validity of business models. The sun would rise the next day, and the future was beginning to look okay.

The Looming Clouds (of change)

Emerging from the rubble, Google was beginning to evolve and disrupt; it was described to me as not just a search engine but as an incubator of people and information, which to date Google has been quite successful at accomplishing. Apple would deliver the iPod and iTunes, redefining the user experience. Amazon survived to become the A to Z, a platform for third-party sellers, release the Kindle, and provide the technology for the next cog in the wheel of progress, referred to as the cloud. The internet itself was starting a new phase in its progression: the rise of social started to take place with MySpace, YouTube, blogging sites, Facebook, and Twitter. This shift brought with it a more diversified use of the web: a place for community interaction, personal productivity, entertainment, and cost-effective environments where organizations could run their business operations.

Disruption would come in the form of economics: how do we get costs down so we can realize the value? Linux would give us an operating system that worked on commodity hardware; MySQL would provide us with a database that was free, had a minimal footprint, and was fast enough on reads; Apache the platform to run the application; and PHP the ability to script the code. Each of these was developed separately, then bound together to form what would be called the LAMP stack. This would make creating applications economically manageable, simple, and quick.

The barriers to entry posed by cost had started to come down, as did the initial investments made in this new breed of startups, all of which made for a leaner approach to doing business. With this evolution in technology and the agile approaches progressing with it, one would believe that the methods organizations used to examine their operational data, better understand their customers’ behaviors, manage risk, and measure financial performance would have been part of this forward progress. But upon further inspection of how these organizations were using business intelligence to support decision-making activities, the same problems were still being discussed and encountered in 2010 as if it were 2001. The functionality being added by the business intelligence vendors during this period was considered more style than substance and was not solving ease of use or providing data transparency. To further suppress any possibility of innovation, over the past decade the market leaders would be digested and integrated into the software portfolios of the Oracles, IBMs, and SAPs.

The perception moving forward, that business intelligence would regress to the point that it would be nothing more than a collection of empty buildings collecting maintenance, couldn’t have been further from the truth. Reminiscent of the same cultivation that propelled change in the earlier part of the decade, newcomers have started to emerge, fresh with innovative ideas, disrupting how we think about and perform business intelligence activities in these shaky economic conditions.

Employing a proven and now mainstream combination of SaaS and a scalable cloud model, vetted over the past few years by the likes of SFDC, Workday, and NetSuite, these SaaS BI vendors have witnessed swift acceptance in the business intelligence space. These emerging vendors look to set themselves apart as replacements for Business Objects, Hyperion, and Cognos by causing disruption in terms of economics and functional use. A current roll call of vendors in the SaaS BI space includes Birst, GoodData, Indicee, and PivotLink; and while there has been recent fallout among earlier pioneers such as LucidEra and Blink Logic, the existing vendors have used this to their advantage by adapting their business models while still delivering innovation. They provide users with a platform that offers a higher level of user experience, the beginnings of a collaborative workspace, ease of use, and more user-driven self-service options, while lowering the costs, resource requirements, and implementation time frames that have plagued the on-premise tools.

The adoption of SaaS as a deployment platform has extended beyond BI to the analytical functions that are aligned more closely with business operations, from sales and supply chain optimization to forecasting, predictive analytics, and performance management. Offerings from Right90, Steelwedge, Aha, eVia, Xactly, Adaptive Planning, Appian, and Host Analytics are the beginnings of a unified approach to monitoring the enterprise, with the same advantages as the SaaS BI vendors. What can the combination of these analytical functions in a single ecosystem do in terms of disruption? To paraphrase Ray Wang, an industry analyst: “The biggest benefit of SaaS is not the software, but the collective intelligence of the network. SaaS moves to information brokering.” Taken together with a recent blog post from Boris Evelson, an analyst with Forrester Research, discussing the estimated cost of producing a business intelligence report, the answer is: plenty.

Those vendors who design and deploy from the beginning on a cloud-scaling architecture and a common platform (PaaS) that permits open and efficient data integration in turn change the economies of scale associated with the costs of development, support, and onboarding, while at the same time creating an ecosystem that opens new channels for revenue that traditional business intelligence solutions will most likely never have access to. When aligned, all this could produce the most significant disruptor for business intelligence: the introduction of context to the data being analyzed, leaving behind the days of a generated report open to interpretation.

As with any emerging technology, it’s not without challenges in its adoption into the mainstream, and as bright as the future will be for SaaS BI, it will come at a price. Technically, there are concerns related to platform stability and security that need to be addressed; operationally, the cost of sale needs to be monitored closely along with the pricing strategy; and the concept of Freemium needs to become either a limited or a fading trend for these vendors to survive. There most likely will be further fallout as vendors with similar functionality start to differentiate themselves. Those that survive will need to develop strong ISV relationships to extend value and revenue models, and acquisition looms as the on-premise vendors look to establish themselves in the SaaS space. However, the true measure of success will be how far these vendors can push beyond the 25% user adoption rate of the existing BI solutions used by organizations today.

Even with these challenges, the climate is much better for these vendors than it was for the dot-com startups of a decade ago, based on the economies of scale alone. The upfront costs of development, together with the ongoing management, are much lower for these vendors, allowing for further innovation, which should result in increased value for their users and a revenue channel with further reach. These factors should translate into a platform that allows organizations to ask more questions about their data and close the gap between the event and the actionable decision. Those ripples will be disruptive.

As for my I.R.S. Records t-shirt or pets.com hand puppet, they are reminders of what was; instead, I see these vendors leading the way in realizing the goals and intentions written by H.P. Luhn in 1958 on using business intelligence to support decision-making activities.

The Vision

Lofty title

I just needed that inspiration to start it all off. Yesterday I read a posting by Fred Wilson on the importance of having an online brand and utilizing a blog as your resume. So, no more excuses.

When I set out to start this blog, my intention was to provide deeper insights, along with my opinion of course, on how people in organizations collectively examine the data produced by operational activities. I will share my thoughts on what strategies and practices are acted on to make data useful in the decision-making process, on methods of presenting data to a community, and on why at times the best intentions and the message get lost along the way. All with the purpose of presenting the innovation and potentially disruptive activities that are driving business intelligence and the analytics space, and that could either change the above points or make them irrelevant.

Topics may vary as the blog evolves and will cover a range of my interests and how they relate to the theme of how we use data: from innovative business intelligence/analytic products, data mining and visualization, data management, economics, sports, cycling, pop culture, trends, statistics, mathematics, molecular biology, open source, design thinking, digital media, venture and start-ups, music, coffee, wine, food, and my future intentions.

Comments and opinions are of course welcome and encouraged; after all, I jumped feet first into the waters, and there will be times I deserve to be handed the toaster.