Do No Harm. Matthew Webster



Years later, there is still relatively little oversight of big data, despite these significant problems.

      What makes data brokers more interesting is that today we have more data than at any time in history. The digitization of information makes it far easier to stream data all over the planet in a comparatively short period of time. Sifting through that data is also far easier, which means corporations, governments, and individuals can readily get at that information and use it. We truly are in the era of big data.

      Volume is critical to big data because the more information you have, the better your chance of having the particular piece of data you need. Think about it from a COVID-19 perspective. If you had only two people and those two people died as a result of COVID-19, you might come to the erroneous conclusion that COVID-19 was 100% fatal. While that example is absurd, having a large volume of data helps to weed out the statistical flukes that a small volume of data might suggest. The larger the data set, the more reliable that data tends to be.
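
      The two-person example can be made concrete with a little arithmetic. The sketch below is our own illustration, not a method described in this book: it uses the Wilson score interval, a standard statistical formula, to show how wide the range of plausible rates is for a tiny sample and how it narrows as the sample grows. The function name wilson_interval and the larger sample counts are hypothetical and chosen only for illustration; they are not real COVID-19 figures.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion events / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


# The two-person case from the text: both people died.
low, high = wilson_interval(2, 2)
print(f"n=2: observed 100%, plausible range {low:.0%} to {high:.0%}")

# Hypothetical larger samples that all share the same invented observed
# rate (1 in 50). These are NOT real COVID-19 statistics.
for n in (50, 5_000, 500_000):
    low, high = wilson_interval(n // 50, n)
    print(f"n={n}: observed 2%, plausible range {low:.2%} to {high:.2%}")
```

      With only two people, the range of plausible rates is so wide that the observed 100% tells us very little. As the hypothetical samples grow, the range tightens around the observed rate, which is the sense in which a larger data set tends to be more reliable.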

      Velocity, generally speaking, centers around the analysis of streaming data. The more sources of information there are, and the more sensors that are on a person (patient or not), the better the overall picture that data brokers or hospitals have of that person or patient. The closer to real time the data is, the more useful that data can be to an organization because near-real-time judgment calls can be made. The store-and-forward technique discussed in the previous chapter means that decision making has a lag and may not be as relevant depending on the circumstances. When people talk about the instantaneity of the world, this is what they mean.

      Variety is also key from a big data perspective. Having a single type of data source is good, but having more data sources is even better. Let us use COVID-19 data as an example. If all we had was data on young children, our view of the disease would be very different, because we know that it disproportionately affects the elderly in terms of severity. The greater the variety of sources, the better our overall analysis.
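
      A small sketch can show how much a single-source view can differ from a combined one. Everything in it is an assumption made for illustration: the group names, the counts, and the helper severe_rate are invented and are not real COVID-19 statistics. The only point is that a rate computed from one group can look nothing like the rate computed across all groups.

```python
# Invented counts for illustration only; NOT real COVID-19 statistics.
GROUPS = {
    "children": {"cases": 1000, "severe": 5},
    "adults": {"cases": 1000, "severe": 50},
    "elderly": {"cases": 1000, "severe": 200},
}


def severe_rate(groups, selected):
    """Share of severe cases across the selected groups."""
    cases = sum(groups[name]["cases"] for name in selected)
    severe = sum(groups[name]["severe"] for name in selected)
    return severe / cases


children_only = severe_rate(GROUPS, ["children"])
all_sources = severe_rate(GROUPS, list(GROUPS))

print(f"children only: {children_only:.1%} severe")
print(f"all sources:   {all_sources:.1%} severe")
```

      In this made-up data, looking only at children suggests a far milder disease than the combined picture does. Real analyses would also have to weight each group by its share of the population, which this sketch leaves out.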

      Veracity pertains to the accuracy of data. If our data set were very diverse when analyzing COVID-19, but wildly inaccurate to the point where it looked like everyone was affected the way the elderly are, we would probably be taking very different actions. Having accurate data really matters. If any one of the four Vs fails, we are left with less-than-optimal information.