Category Archives: Data Science

Data Tells a Story: Facebook for whales; legal data; catching health code violators

We at Inform believe that data tells a story, across all industries, and every week we round up the most interesting ones. This week: Facebooking the whales; cutting down on legal bills; and catching health code violators faster.

Making Facebook for Whales

Keeping tabs on and identifying the 500 North Atlantic right whales left in the world is an arduous process. Researchers must take aerial surveys, then painstakingly comb through a database of images and identify each whale individually. One biologist, Christin Khan, turned to Facebook for a solution.

We all know that the social media platform can (creepily) identify us by face. Why not develop a similar algorithm to identify right whales by their distinctive markings? Khan held a contest tasking data scientists with creating such an algorithm.

The winning team’s algorithm is able to identify whales “with 87-percent accuracy.” Such a development is useful to scientists for multiple reasons. Those who study whales’ genetics will know right away if they’ve already taken a biopsy from a particular whale, and most importantly, it will “free up researchers’ time to do actual research.”

How Big Data Is Disrupting Law Firms And The Legal Profession

While the legal system generates huge amounts of data, little of it is used beyond billing, time management, marketing, and customer relations. In other words, it’s not being used in the real practice of law. One legal research and analytics firm wants to change that.

LexisNexis and Westlaw provide a wealth of legal data. However, they are mostly used as search engines. The firm uses an analytical algorithm to draw often surprising insights. For example, it can scan through judges’ decisions and determine which judges might be the most sympathetic to a particular case. The firm is also teaming up with Harvard Law School to digitize the school’s entire U.S. case law library and make it publicly and freely available by 2017.

Using data in such a way may lead to greater efficiency, transparency, and accountability. Legal costs could be reduced, as well as the need for time-consuming appeals and retrials.

Foursquare’s Plan to Use Your Data to Make Money—Even if You Aren’t a User

With 50 million monthly active users, Foursquare’s apps might lag behind social network behemoths such as Twitter, Instagram, and Pinterest, but the company is still putting all that data to use by developing an ad-targeting and location data business.

While many apps can access your GPS coordinates, more difficult is matching those coordinates with real places. Foursquare is doing just that using the “massive database” users have helped to compile over the years. As a result, for instance, frequent fast food visitors could be identified and served up ads for fast food chains or “perhaps healthier alternatives or gym memberships.”

How big data can drive employee engagement

One software company is looking to help employers better track employee satisfaction and therefore more accurately predict when workers might be thinking about leaving.

While annual performance reviews are valuable to an extent, they may not paint a real picture of an employee’s happiness or lack thereof. The company suggests that its software, which collects employees’ self-reported moods in real time, would be more accurate. In addition to recording moods, the software allows both employees and customers to give immediate shout-outs to colleagues.

Chicago Is Predicting Food Safety Violations. Why Aren’t Other Cities?

Like most U.S. cities, Chicago inspected its eating establishments in the traditional way: scheduling inspections by going down a list. However, this method didn’t allow for those most likely to violate health codes to be inspected first, and time is definitely of the essence in such a situation. The longer a potential violator is allowed to operate, the greater the chance that diners will get ill.

To address this, Chicago’s Department of Innovation and Technology built an algorithm to analyze the city’s publicly available data and predict which eateries would be most likely to violate health codes. Not only did the algorithm identify violations faster than the traditional way — 7.5 days earlier, to be exact — it was designed in a way that made it easy for other cities to replicate.

However, only one city has followed suit so far. Even with Chicago’s model, such an endeavor requires a lot of work. Still, making the code and method freely available is a vital first step.

Data Tells a Story: helping the poor, predicting a hit song, defining happiness

We at Inform believe that data tells a story, across all industries, and every week we round up the most interesting ones. This week: helping the poor, predicting a hit song, defining happiness.

An Army Of Data Just Joined The War On Poverty

To get a better idea of the needs of the poor in real time and to direct resources more efficiently, university researchers and the Salvation Army have partnered to develop the Human Needs Index.

The index goes beyond just income or unemployment rates by looking at several measures of poverty based on services used and requests for assistance at Salvation Army locations across the country.

As a result, the index can show changes in regional or seasonal needs. For instance, the researchers noticed a big spike in energy-related needs in April. This was because utility companies are often forbidden from shutting off power during the winter months and simply wait until springtime to do so.

Cities Using 311 Data in Novel Ways Discover Drawbacks

Somerville, MA, New York City, and Chicago are just a few cities leveraging data collected from non-emergency 311 calls to make public works improvements. Somerville used its call-ins about rat sightings to cut down on the rodent population, while New York uses 311 data to target unauthorized uses of apartments. Meanwhile, Chicago is looking into restaurants with public health violations.

However, the data has some gaps, and accounting for the differing behaviors of various demographics might help close them. For example, immigrants might be less likely to phone in non-emergency complaints; in some neighborhoods, people may choose to resolve conflicts themselves; renters are less likely to call in issues like damaged trees; and senior citizens might be more inclined to dial 0 as opposed to 311.

4 Ways Gyms Are Using Data To Drive New Members & Enhance the Workout Experience

Need help keeping that fitness New Year resolution? Some gyms are leveraging big data to help their members do just that.

Some customers are being encouraged to use wearable technology so that their stats can be reported back to their personal trainers while they’re not at the gym. Some gyms are creating their own apps, while others’ fitness equipment collects members’ data. Yet others are using their members’ habits and preferences to improve fitness programs, optimize resources, and better target ads and reward programs.

These Students Are Using Data Science to Predict Which Rap Songs Will Become Hits

Data science students at the University of California, Berkeley developed an algorithm to determine if there’s “a science to creating a blockbuster rap,” says Mic, and found that indeed there is.

The student researchers found that the biggest factor for a song’s success was the amount of profanity: the more the better. Another, perhaps more surprising, factor is variety in theme. A song that explores lyrical themes beyond the rapper’s lifestyle tends to be more successful.

The students also discovered the most popular locations, with New York, Los Angeles, Atlanta, and Chicago topping the list, as well as the most popular brands, including Bentley, Porsche, Apple, and Twitter.

What is #happiness? One Brooklyn artist is using Instagram data to figure it out

For 12 months, a Brooklyn artist collected 100,000 Instagram photos with the hashtag #happy and came away with some interesting findings. One is that selfies “are enormously popular with the hashtag #happy,” while another is that #happy photos often have “warm, muted tones.” Finally, despite the popularity of Instagram’s filters, the most common choice is no filter at all.

Data Tells a Story: train delays, medical decisions, snacking smartly

We at Inform believe that data tells a story, across all industries, and every week we round up the most interesting ones. This week: predicting train delays; making wise medical decisions; and snacking smartly.

How big data predicts and helps prevent train delays in Sweden

The commuter rail operator in Stockholm is using big data to forecast and prevent delays.

Using historical data to look two hours into the future, their prediction model anticipates and acts on disruptions that have yet to happen. For example, the model may predict that a train will be 10 minutes late to a certain station. In response, another train is sent to that station on time, preventing a “ripple effect” of delays that would otherwise grow exponentially.

Big Data reveals the surprising profile of an ISIS recruit

In light of the attacks in San Bernardino and Paris, one data scientist decided he wanted to do more than “pray and condemn the violence.”

Zeeshan ul-hassan Usmani pored over “data on ISIS recruits the way he normally analyzes data on consumers for major brands,” including social media posts and the cases of accused terrorists.

He came away with several findings. One is that there are over 70,000 people in North America, Australia, and Europe “ready to radicalize.” Another is that recruits are mostly young and male; more likely to be educated; and from middle or upper middle class families. They also don’t necessarily have a devoutly religious background but are more likely to have been secular and become radicalized.

In addition, he discovered what could be a connection between the number of those ready to be radicalized and the prevalence of Islamophobia. For instance, he estimated that France has over 27,000 potential recruits (as opposed to a little over 1,500 in the UK). France also has the largest Muslim prison population and has had 26 mosques vandalized since the attack at Charlie Hebdo earlier this year.

Using Big Data to Make Wiser Medical Decisions

In this article, a physician explores the different ways data can help patients better manage their health care. One way is through patient-generated data. Using data collected from a wearable device, Dr. Halamka tracked his own blood pressure levels and the possible causes, finding that his high blood pressure was most likely genetic and not caused by external factors.

Data can also help with precision medicine. When his wife was diagnosed with breast cancer, Dr. Halamka was able to use open source software to assess the treatment of 10,000 women who fit his wife’s criteria and determine the best course of treatment (his wife is now cancer free).

Cruz campaign credits psychological data and analytics for its rising success

While Ted Cruz has spoken out against excessive government data collection, his presidential campaign has been actively collecting and analyzing data from supporters and potential voters to personalize messages, calls, and visits.

The data comes from a variety of sources including Facebook posts, buying habits, an app that keeps supporters “in touch” with the campaign while scraping their contacts, surveys of more than 150,000 households, and geo-fencing, geographically tracking people through their mobile devices.

From the collected data, the Cruz campaign, working with a data analytics firm, built several profiles, such as the “stoic traditionalist,” a conservative voter mainly concerned about immigration, and tailored messaging to those profiles (“confident and warm,” “straight to the point”).

Missives were also designed according to how people scored on certain attributes. Those who scored high on “neuroticism” would receive pro-gun messages emphasizing the use of weapons in terms of personal safety, while those who scored high for “openness” would receive a pitch on the idea of hunting as a family activity.

How Gousto is using data to change the way we shop for food online

UK-based startup Gousto makes cooking easier by delivering ingredients in a box. But they don’t just take orders: they ingest data to learn more about what their customers like.

The company built a data engine “to tag every ingredient and recipe to build up a network understanding” of their customers’ preferences. Their recommendation engine, dubbed “Laura,” analyzes millions of data points to predict what people like to eat and when.

Gousto’s tactic is similar to that of Naturebox, a U.S. startup that delivers healthy snacks and recommends snacks tailored to individual tastes based on an algorithm they developed.

Data Tells a Story: catching lies; fighting the flu; Chinese shopping trends

We at Inform believe that data tells a story, across all industries, and every week we round up the most interesting ones. This week: catching lies, fighting the flu, and what the Chinese are buying.

Lie-detecting software uses real court case data

How do you catch a liar? Humans are bad at it, say researchers at the University of Michigan, and perform only slightly better than a coin-flip. That’s why the team is using real-world data to build a better way.

The researchers’ lie-detecting software is based on data from a set of 120 video clips from “high-stakes court cases,” half of which had been deemed to be deceptive. To obtain the data, the audio was transcribed and the frequency and types of words were analyzed, as well as the number and types of gestures.

The software was found to be up to 75 percent accurate in identifying who was lying, while humans were right only a little more than half the time. The software also discovered several “tells.” For example, liars were more likely to scowl or grimace; look directly at the questioner (perhaps as a way of overcompensating); gesture with both hands; and use speech fillers such as “um.”

Scientists use big data to fight flu

Flu season can be deadly. In Switzerland, the flu virus results in as many as 5,000 hospitalizations and 1,500 deaths every year. So Swiss researchers, along with those from Germany and the U.S., are looking for a way to decrease those numbers.

After analyzing datasets from publications on the host molecules that flu viruses rely upon to replicate, the team “discovered 20 previously unknown host molecules that promote the growth of influenza A viruses.”

One of those host molecules is known as UBR4, which can help a flu virus produce as many as 20,000 new viruses. The scientists discovered that blocking UBR4 prevents that virus replication and therefore “is feasible as a therapeutic strategy for the treatment of influenza.”

FBI to start tracking animal cruelty in 2016

While animal cruelty cases were previously placed in a general category in the FBI’s National Incident Based Reporting System, starting in January they will be placed in their own specific categories, including neglect and intentional abuse, and will be classified as “crimes against society.”

Such a change is important not only to prevent cruelty to animals, but also to predict escalating acts of violence. Previous research has found links between animal cruelty, domestic violence, and other criminal acts. Most recently, this pattern was found in the case of Robert Lewis Dear, the alleged shooter at a Colorado Springs Planned Parenthood clinic, who has been accused of both animal cruelty and domestic violence in the past.

How IBM Is Using Big Data To Battle Air Pollution In Cities

Beijing recently issued its first red alert for pollution, and IBM is trying to use big data to remedy the problem of very unhealthy air in China’s capital and other cities.

Using machine learning, data scientists will analyze the quality and accuracy of previous weather forecasts, and build improved forecasting models from there. In the past, the more a city knew about the source and amount of pollution in its air, the more likely it was to take action, resulting in lowered pollution levels and improved public health.

Ideally, as a result of such number crunching and analyses, cities like Beijing will have issued their first and last red alert.

Alibaba’s Consumer Behavior Data Reveals Trends in China

E-commerce behemoth Alibaba recently released its latest big data report on consumer habits.

Analyzing data based on the behavior of 300 million shoppers from 2011 to this past September, Alibaba came away with several findings. For instance, they found that consumers were buying healthier, investing much more in purchases such as organic foods, healthcare products, and sports equipment.

They also found that those born in the 1980s and ’90s were the biggest shoppers, and, most surprisingly, that people shop much more during the Magpie Festival, or Qi Xi, a sort of Chinese version of Valentine’s Day that falls in August, than on Western Valentine’s Day, showing perhaps that “young Chinese people have started to value their own tradition.”

Data Tells a Story: crash test dummies; big pharma; the evolution of smiling

We at Inform believe that data tells a story, across all industries, and every week we round up the most interesting ones right here. This week: data driving crash test dummies; taking on big pharma; the evolution of smiling.

A Smarter Kind of Crash Test Dummy

While traditional crash test dummies can provide data on about 20 points on the body, says Technology Review, a new digital simulation can provide much more detail.

Based on five years of data collection on thousands of virtual crash simulations, information drawn from a database of injury research, and a digital model of the human form with 1.8 million elements, a research team at Wake Forest University has developed a digital crash test dummy which can test “a variety of body shapes and sizes and different body positions at the moment of impact,” and can “quantify the risk of bone fractures and damage to soft tissue and organs, injuries unaccounted for by crash test dummies.”

Car manufacturers are finding the data invaluable. While using actual crash test dummies comes late in the design process, manufacturers can use digital dummies very early on and make modifications to improve safety, which in turn cuts down on costs as well.

Deutsche Bank to sift ‘big data’ to get closer to customers

Deutsche Bank is upgrading its systems to leverage data in order to improve customer service and experience. Such an upgrade will “provide a detailed picture of how, when and where customers interact,” and allow the bank to see previously unseen patterns and gain new insights.

The data is often provided by the customers themselves, such as when and how they log in, which products and services they use, and when and from where they use them. Insights into such data might help Deutsche personalize services according to customers’ specific needs, identify bottlenecks, and solve problems more quickly.

Big Data Predicts Centuries Of Harm If Climate Warming Goes Unchecked

To understand Earth’s complex climate and make predictions such as how greenhouse gas emissions will affect our future, scientists run climate simulations on thousands of linked supercomputers.

Measuring factors such as the amount of sunlight reflecting off sea ice and how the wind affects ocean currents, scientists have come up with a climate model that shows that if greenhouse gas emissions keep increasing, “the world will look different.” For instance, there will be “very little ice left in the Arctic” and New York might be as warm as Miami.

Can big data lead to lower costs for health care?

One data scientist is tackling the big issue of skyrocketing health care costs by taking a look at data around brand-name prescription medicines.

Using Medicare drug prescription data from 2013, he studied the number of times a drug was prescribed, the costs, and generic versus brand name costs, and found several patterns.

One was peer influence: doctors are likely to prescribe the same drugs as their colleagues, which are often the more expensive brand names rather than generics, although generics have been proven to be just as safe and effective as their pricier counterparts.

Another trend found was patient-driven demand. Pharmaceutical companies are very good at marketing expensive branded versions as new and better, even though they’re no better than generics.

Data Mining Reveals How Smiling Evolved During a Century of Yearbook Photos

Until recently, data mining from photographs has proved to be difficult. The data set is immense, starting from the advent of photography 150 years ago, and the information is often “too complex or too mundane” to put into words. However, a machine-vision approach to data mining developed by a research team at UC Berkeley is changing that.

To test their method, the team tackled a database of American high school yearbook photos from 1905 to the present, and found, among other patterns, an “evolution of smiling.”

Right after the invention of photography, most opted for the more easily held neutral pose similar to that used for a painstakingly painted portrait. But as photography became more popular and Kodak advertised the idea of recording “happy memories,” smiling took over, and people began to say “cheese” over “prunes” when posing for a snap.

Data Tells a Story: holiday shopping, greenhouse gases, data fakery

We at Inform believe that data tells a story, across all industries, and every week we’ll be rounding up the most interesting ones right here. This week: helping your holiday shopping; reducing greenhouse gases; and finding data fakery.

IBM Watson Trend App: Big Data Meets Holiday Shopping

Overwhelmed with holiday shopping? An app built on big data might help.

Pulling in information from more than 10,000 sources, including “social media, major ecommerce sites, blogs, product reviews, and rankings,” the app provides the 100 trendiest products in consumer electronics, toys, and health and fitness.

While some of the app’s findings are obvious, some are less so. It’s no surprise that Star Wars LEGOS will be hot hot hot, but the app’s data says that LEGO may not be able to keep up with demand, and that those interested should buy early. The app also shows that smartphones haven’t killed digital cameras: Instagram has renewed interest in higher quality photography.

Jersey Utility to Use Methane Data Mapped by Google Street View Cars to Target Gas Line Repairs

New Jersey’s Public Service Electric & Gas (PSEG) has a three-year $905 million plan to replace over 500 miles of old, methane-leaking pipes, and is using big data to help determine the most efficient way to spend their money and time.

Through a partnership with Google Earth Outreach and the Environmental Defense Fund, PSEG used a Google Street View car equipped with methane sensors to collect six months’ worth of data from thousands of miles of roadway, and from there will make decisions around scheduling and prioritization.

Methane is a greenhouse gas, and the hope is that such an effort will improve the environment as well as safety.

Stanford researchers uncover patterns in how scientists lie about their data

You can lie but you can’t hide.

Two Stanford researchers have discovered the writing patterns of scientists who lied about their data. To do so, they identified over 200 papers that had been retracted from science journals between 1973 and 2013, and compared the writing to unretracted papers in the same journals and time frame, and about the same topics.

Next they measured the “level of fraud” in the papers using an “obfuscation index,” which rated the amount of abstract language and jargon. The researchers believed that obfuscation of language is related to fakery in general, and that a scientist trying to hide fraud might want to “obscure parts of the paper.”

The researchers found that fraudulent retracted papers scored high on the obfuscation index, i.e., each had about 60 more jargonish words than non-retracted papers.

How sharing police data can improve relationships with communities

While some might think of using data to prevent and understand crime as a modern phenomenon, it actually goes back to at least 1889.

African American journalist and activist Ida B. Wells examined 10 years’ worth of Chicago Tribune reports on lynching, and found a pattern that was surprising for the time. While many believed young black men were being lynched as punishment for rape and murder, their “crimes” were actually not crimes at all, but were reasons such as “having a bad reputation”; “writing an insulting letter”; or nothing at all.

It’s a lesson that could be heeded today. Experts say that the media often focuses on one incident that “looks good TV,” while data provides a fuller picture. On the other hand, data is rarely neutral, and should also be viewed in the context of larger conversations about race and community.

In Bangladesh, a Half-Century of Saving Lives With Data

Matlab, the name of both a region and research site in Bangladesh, has been collecting and analyzing census and health data from residents for 50 years, and as a result basic health has much improved. For instance, in the 1960s many children in Matlab didn’t survive into adulthood, while now more than 90% do.

Using data collected from the residents, Matlab was also able to develop and test lifesaving treatments, such as the low-cost oral rehydration solution for cholera victims, which ended up saving the lives of about 50 million people worldwide, as well as zinc for childhood diarrhea.

The data also allows for retrospective study. One group of researchers wanted to understand if malnutrition in pregnant women would affect their children’s health in adulthood, and indeed found that adult children of women who were pregnant during Bangladesh’s 1974-75 famine were three times more likely to develop pre-diabetes.

Matlab is now finding that it’s those non-communicable diseases such as diabetes, heart disease, and cancer that are the leading cause of death among residents. It’s up to future Matlab generations to find the cure.

Data Tells a Story: Formula 1, fighting obesity, helping schools

We at Inform believe that data tells a story, across all industries, and every week we’ll be rounding up the most interesting ones right here. This week: big data for speedier racers; fighting obesity; and helping schools help troubled students.

How Formula 1 Teams Use Big Data to Win

Formula 1 race cars are more than just pricey speed machines. Nowadays, they’re giant data sensors, collecting data on factors such as the effect of stress and downward force on a car, brake temperature, tire pressure, and of course speed, and feeding that data back to analysts and engineers to measure how well the vehicles are performing.

The data is also modeled to obtain predictive intelligence on how the cars will perform in the future. In addition, vehicles are built for each track “based on historical data and simulations generated by the current season’s sensor data.”

However, data analytics aren’t the answer to everything. For example, it’s still impossible to capture “an accurate sense of where the cars are laterally on the track,” as well as “how well a tire is gripping the roadway.” In those cases, the best sensor is still the human driver.

Coming Home: This West Point grad is using AI and Big Data for national security

Computer scientist and former army intelligence officer Paulo Shakarian is working on ways to use machine learning and big data to improve military intelligence.

While stationed in Iraq, Shakarian noticed that while intelligence workers like himself were tasked with analyzing all available data and hypothesizing possible courses of action, few actually had time for this in the midst of war.

This is where machine learning and big data come in. Shakarian’s research over the years has resulted in software used to detect IEDs, or improvised explosive devices, in Afghanistan; social media programs that help Chicago police fight gang activity; and a mathematical model of the behavior of ISIS.

Bringing big data to bear on organ failure

The big data approach to medicine is quite different from health care’s traditional method of posing a hypothesis, devising an experiment, and testing. With big data, it’s about discovery — in other words, collecting huge amounts of data and seeing what patterns it returns.

The latter is the approach physicist Plamen Ivanov is taking at Massachusetts General Hospital with regard to the way organ systems interact. He and his team are collecting “hours and hours of data on vital signs,” such as that from EKGs, EEGs, and ventilators, and seeing if they can “tease out how organ systems communicate with each other and coordinate behaviors.”

If his project is successful, Ivanov imagines a new kind of patient monitor that instead of just measuring blood pressure, heart rate, and brain activity, would “track the relationships between key organ systems — alerting doctors to cataclysmic phase changes in human health before they occur.”

Fight Obesity with Data

A team at the University of Virginia’s School of Engineering and Applied Science is implementing a program that aims to capture data on environmental and behavioral factors that could contribute to childhood obesity.

Rather than relying on self-reported or anecdotal data, the team is using an in-home monitoring system made up of sensors that monitor factors such as tone of voice, mealtime distractions, frequency of meals, and stress levels. The team aims to identify and model “preventable behavior patterns,” and if successful, plans to expand to other medical fields.

How the city is using Google Drive to revamp its struggling schools

Sometimes just the existence of a massive amount of data isn’t enough. Tools are needed to streamline the collection and analysis process.

Some New York City school systems are facing this challenge, and a nonprofit is helping. It provides a way for the schools to feed the “vast supply of data” from multiple databases and sources into a single spreadsheet, and then trains school officials on how to track student performance and devise plans to address issues and make improvements.

Before the implementation of this new system, some school workers had the arduous task of printing out reports and scouring them for patterns; using antiquated systems that resembled MS-DOS and required the typing of four-letter codes; and comparing paper documents and highlighting pertinent information by hand.

With the new tool, schools not only save time but are able to make decisions more efficiently and based on data rather than guesswork.

Data Tells a Story: health care by the numbers; diversity in language; some real cash cows

We at Inform believe that data tells a story, across all industries, and every week we’ll be rounding up the most interesting ones right here. This week: health care by the numbers; diversity in language; and some real cash cows.

Medical Students Crunch Big Data To Spot Health Trends

Big data is changing the way medicine is being practiced and the way medical students are learning.

As a result of the rise of evidence-based medicine, future doctors must learn how to manipulate and analyze sometimes huge amounts of data. At the NYU School of Medicine, first- and second-year med students are required to do a “health care by the numbers” project based on a database with over five million anonymous patient records. Included are the patients’ race and ethnicity, zip codes, diagnoses, procedures, and payments.

The students’ findings have been interesting. One student, in comparing the prices of hip replacement surgeries throughout New York state with the cost of Big Whoppers, found that the cost of the surgery was even more inconsistent than that of the fast food. Another student came to similar findings regarding the cost of cesarean sections.

Neuropolitics, Where Campaigns Try to Read Your Mind

Some political candidates have taken a page from neuromarketing, using technologies such as facial coding, biofeedback, and brain imaging to gather data on potential voters and then using that data to hone their campaigns.

For instance, the current president of Mexico used tools “to measure voters’ brain waves, skin arousal, heart rates and facial expressions,” while in Turkey, the prime minister hired a neuromarketing company that found via tracking brain waves, facial expressions, and heart rates that the PM’s speeches lacked emotional engagement.

How ING Direct is boosting customer loyalty using data analytics

ING Direct is boosting retention by rewarding customers for their loyalty. However, the rewards aren’t generic. Using a combination of data analytics, insights, and communication, the company has devised contextually relevant offers targeted at specific customer segments, creating even more meaningful customer experiences.

Americans Speak Over 350 Languages At Home, Census Data Shows

The findings from the American Community Survey from 2009 to 2013 show the vast variety of languages in the U.S.: over 60 million Americans speak a language other than English at home.

Spanish is the most common, with 37 million speakers, more than half of whom have learned English. On the other hand, only 40 percent of Vietnamese and Chinese speakers said they spoke English “very well.”

The survey’s data also showed, perhaps not surprisingly, that the biggest U.S. cities are also “the most linguistically diverse,” with 192 languages spoken in New York and more than half of L.A. residents speaking a second language. Native American dialects are also well represented, with 150 spoken, including Navajo, Apache, and Cherokee.

How RFID Delivers Big Data On Cows And Milk Production

You might have heard of precision agriculture. Now there’s precision dairy farming.

Farmers in India are using data collected by RFID tags to track various aspects of their cows, including nutritional levels, how much they’re eating, and signs of disease. The bigger goal is to increase milk production while reducing the number of cows, which in turn helps the environment by cutting down on the amount of methane produced.

One challenge these farmers face is finding the time to analyze all of that data. So if you’re a data engineer who has always wanted to live on a farm in India, there’s probably a job for you.

Data Tells a Story: the war on obfuscation; weather hunters; surviving the zombie apocalypse

We at Inform believe that data tells a story, across all industries, and every week we’ll be rounding up the most interesting ones right here. This week: the war on obfuscation; hunting for weather; surviving the zombie apocalypse.

A Mexican startup has pieced together the elusive data behind the country’s secretive drug war

The drug war in Mexico has been making headlines for the past four decades. However, what’s been lacking is consistent data that helps “Mexicans understand their country’s brutal cartels—or how effective their elected officials have been at combating them.” A startup in Mexico called Animal Político is trying to provide that data.

Animal Político has its work cut out for it. While a new transparency law was enacted in May, many documents still remain classified: 12 out of 15 million, to be exact. In February the UN chided Mexico for its lack of precise statistics on kidnappings, especially significant in light of the kidnapping of 43 students.

Only two of Animal Político’s more than 12 requests for government documents were fulfilled, yielding “several pages of blurry PDFs,” but from there the startup was able to build several interactive visualizations “detailing turf wars and cartel-affiliated armed groups.”

How Brands Are Using Data to Drive Personalized Context In-Store

Some retail stores are leveraging the ubiquity of mobile devices and the availability of data to create personalized experiences for their customers in-store and in real time. As a result, potential buyers might get hyper-relevant recommendations, in-the-moment deals, and in general another layer of context around their shopping experience.

One store tracks what products customers bring into the fitting room and recommends complementary products in real time or as a follow-up. Another retailer allows customers to check in with their PayPal apps and browse via screen, after which an associate gathers the items and texts the customer once a fitting room is available. RFID tags on the merchandise provide up-to-date inventory, and smart mirrors in the fitting rooms allow customers to request other sizes or items.

IBM Works With The Weather Company To Track Earth’s Big Data Atmosphere

IBM and The Weather Company are teaming up to offer 100% accurate meteorological predictions three days ahead of time, based on the “massive” amounts of data The Weather Channel is already collecting.

The Weather Channel’s “data space” is so huge because it goes from “the surface of the Earth all the way around the globe up to the top of the atmosphere,” which is about 100 kilometers high. Later on, social media data points might also come into play, similar to the way the US Geological Survey National Earthquake Information Center is using earthquake-related tweets to improve quake detection.

Alien hunters turn to IBM and big data tool in hunt for ET

While The Weather Channel hunts the skies for weather, the Search for Extra-Terrestrial Intelligence (SETI) Institute is looking heavenward for signs of alien life, and using big data to do so.

SETI has access to a large amount of data gathered by the Allen Telescope Array and analyzes that data to find radio signals “that differ from background astrophysical and human signals.” As a result of listening to signals for four years, the institute has a robust database of signals identified as interference from “humans and non-alien source,” which they then can compare to signals that are out of the ordinary.

Baltimore could probably survive zombie apocalypse, data show

If a virus-based zombie apocalypse should ever hit, apparently one of the best places to be is Baltimore.

CareerBuilder and Economic Modeling Specialists Intl. analyzed various data points to rank cities in terms of zombie apocalypse survivability. The companies used eight criteria, including defendability against the zombie virus, means for containing the virus, likelihood of finding a cure, and a sufficient food supply to outlast the epidemic.

How did they measure such criteria? For a city’s ability to defend against the virus, they analyzed “the percentage of the area’s population in the military, law enforcement, firefighting and security,” and “percentage of total exports coming from small arms manufacturing industries.” For likelihood of curing the virus, percentage of bio-medical research and professionals was analyzed.

The only cities more likely than Baltimore to survive are Boston, Salt Lake City, and Columbus. The number one least likely? New York.

Data Tells a Story: Comcast; Penn Medicine; urban bike sharing

We at Inform believe that data tells a story, across all industries, and every week we’ll be rounding up the most interesting ones right here. This week: Comcast; Penn Medicine; and urban bike sharing.

Comcast Seeks to Harness Trove of TV Data

The cable company has access to viewing data from 18 to 22 million subscribers across much of the U.S., and is planning on licensing that data to other companies for a multitude of purposes.

Analyzing such data could help boost the TV-ad market; fill the gap in traditional TV ratings, which omit data from mobile devices and on-demand and streaming services; and more narrowly target audiences’ interests, whether the audience is consuming ads or programming.

Amadeus using data to help airlines tackle disruption

A technology company has partnered with an airline to develop new technology to help airlines better manage disruptions such as bad weather and air traffic congestion.

The technology uses a recommendation engine to analyze data drawn from sources including air traffic control, maintenance, and crew management systems, in order to help airlines make more efficient decisions on issues such as whether or not to delay or cancel a flight. Other possible applications include improving issues related to check-in time, airport gates, and luggage belts.

Big data for the rest of us: One UK insurance company’s success story

An insurance company in the UK is using big data to improve fraud detection and lower customer cancellation rates. As a result, they’ve been saving $7.5 million annually since implementing their data collection and analysis processes.

The company’s large volumes of data include 20 million insurance quotes a day; premiums based on these quotes; customer risk factors; and other customer information such as credit scores, identity checks, and fraud data. Processing such data has helped to identify previously unknown patterns and differences between customers; to highlight “fraud indicators earlier in the customer journey”; and to make pricing decisions based on customer behavior.

Penn Medicine’s big data system triggers early detection of life-threatening infections

Big data is helping Penn Medicine to innovate on clinical quality improvement, genomic research, diagnostic apps, and more. A team of clinicians and data scientists is using a huge volume of data “to build prototypes of new care pathways,” which are tested with patients. Those results are fed back into the algorithms “so that the computer can learn from its mistakes.”

A significant success the team has already achieved is the prediction of sepsis infections a full 24 hours earlier than before the algorithm was introduced.

Jurisdictions Gather and Use Data on Bicycles for Planning, Governance

As urban bike-sharing programs become more and more popular, some cities are taking advantage by letting their bikes gather valuable data. For instance, Portland, Oregon is harnessing the data collected by bike-counting sensors to back their increasing investment in biking culture, such as adding more “bike-friendly infrastructure” and boosting bike-share programs.

Chicago’s bicycles record basic demographic data from yearlong pass holders, as well as where they’re going and where they’ve been. The Windy City goes a little further by publishing the data and letting the public create visualizations based on it. One showed which was faster, biking or using public transit, depending on the route. Another displayed the amount of bike traffic in each neighborhood as well as the neighborhoods that needed more bike lanes, while another project created a dating application specifically for bikers.

Dublin fitted 30 of its bikes with air sensors to measure the city’s air quality. The sensors gather data on levels of carbon dioxide, carbon monoxide, smoke, and particulates, and from there the city can determine the cleanest routes for bikers as well as problem spots where air quality needs to be improved.