

Big data grows up
Apr 5, 2018
3 MIN. READ

In the age of unprecedented tech innovation, it can be easy to forget our roots.

Data is the lifeblood of an organization. Companies such as Google, Uber, and Facebook have built business models around the data they collect. With all the promise data offers, it also presents challenges. For example, securing consensus across the organization on the definition of data elements can be a daunting task. The time required to clean and prep data for use can be excessive and expensive. Long turnaround times for new reports cause frustration, so users end up building their own reports in Excel. But one of the biggest challenges is slow system response times. I’ve always thought there had to be a better answer to all this frustration.

Thinking through today’s biggest data puzzles has made me reflect, though, on just how far we’ve come.

One of my most memorable projects was building an enterprise data warehouse (EDW) for a large federal agency. The EDW is where organizations collect and store key data from across the organization so they can measure performance against key indicators. Although I had spent my entire career developing software applications, I felt like I was fresh out of college when I had to design and build my first data warehouse. It required a fundamental shift in thinking. The word data took on an entirely new meaning.


A few months into that experience, I learned that most of the difficult design decisions had to do with designing data models that would perform well when users submitted complex queries. This was no easy feat. We had to anticipate how users were going to query the data before we could design the data models. It also required significant amounts of code to pre-aggregate data so we could present it just as the user would expect it. Any small change request caused us to tremble because we knew it had the potential to cause ripple effects through our code and data models.
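To make that trade-off concrete, here is a minimal Python sketch. It is not the code from that project; the record layout, the office and year values, and the function names are all hypothetical. It contrasts a summary built ahead of time for one anticipated question with an aggregation computed from the raw records when the question arrives.

```python
from collections import defaultdict

# Hypothetical raw records: (office, fiscal_year, amount).
RAW_RECORDS = [
    ("east", 2016, 120.0),
    ("east", 2017, 80.0),
    ("west", 2016, 50.0),
    ("west", 2017, 70.0),
]


def pre_aggregate(records):
    """Build a summary keyed by office only, as a batch job might do ahead of time."""
    totals = defaultdict(float)
    for office, _year, amount in records:
        totals[office] += amount
    return dict(totals)


def aggregate_on_demand(records, key_index):
    """Scan the raw records and group by whichever column the user asks for."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)


# The pre-built summary answers "total by office" instantly...
SUMMARY_BY_OFFICE = pre_aggregate(RAW_RECORDS)

# ...but "total by year" was never anticipated, so it needs either new
# pre-aggregation code or a scan of the raw data at query time.
TOTAL_BY_YEAR = aggregate_on_demand(RAW_RECORDS, key_index=1)
```

The pre-built summary is fast to read but only answers the question it was designed for. Every new question means another summary and more code to keep in sync, which is where the ripple effects come from.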

Then, about 10 years ago, a new set of data visualization tools emerged that promised to ease the pain. We bought into that promise and were early adopters. Now we could just take raw data and create our own visualizations without having to wait on perfect data models. Despite the potential, we quickly realized this didn’t scale well. As soon as we reached 100 million records, we started to have major performance problems. Despite their quirks, these new tools represented a qualitative victory: they helped us envision data uses that would have seemed impossible just a few years prior.

They were not a silver bullet, but they were a step in the right direction.

MemSQL, a small startup founded by two former Facebook employees, recognized the pain points and took the next step. Seeing the challenges faced by our data team and many others, they developed an in-memory database specifically to address performance issues. I won’t go into all the details, but MemSQL’s in-memory database performs so well that it eliminates the need to pre-aggregate data. This means users can get access to data as quickly as they can collect it. It also means development teams don’t need to spend their time building perfect data models that anticipate a user’s every interaction with the warehouse. Instead, they can shift their energy to delivering better insights.

I’m sure this cocoon of smart kids in Silicon Valley will continue to push the envelope with innovative database solutions that enable us to unlock the value hidden in our data. I just wish those solutions had been around when I worked on my first data warehouse project.
