Tag Archives: data mining

The Brain Wikipedia – Scientists Launch Open-Access Neuron Database

The human brain is one of the biggest and most intriguing mysteries scientists are tackling. It’s an incredibly active, bustling place that keeps us going and effectively makes us the people we are. There are about 100 billion neurons processing and transmitting information through electrical and chemical signals and to make things even more complicated, each of these neurons has about 10,000 different connections to neighboring brain cells. Needless to say, mapping and understanding all these neurons and connections is a gargantuan task – that’s why computer scientists and biologists from Carnegie Mellon University in the US have created an open-access database indexing all the known physiological information about neurons.


Basically, they’ve developed a wiki-like system called NeuroElectro.

“The goal of the NeuroElectro Project is to extract information about the electrophysiological properties (e.g. resting membrane potentials and membrane time constants) of diverse neuron types from the existing literature and place it into a centralized database”.

I took a look at the website, and I have to say, it’s a fantastic achievement. The design is pleasant, information is easy to access (if you know what you’re looking for), and speaking of information – there’s a LOT of it. The roughly 300 types of neurons are arranged, discussed and presented in an almost exhaustive fashion.

The database was created by computational neuroscientist Shreejoy J. Tripathy, from the University of British Columbia in Canada, who analyzed almost 10,000 published papers describing how neurons react to various inputs. He then used text-mining algorithms to ‘read’ each of the papers, retrieving information on how the neurons function, how they respond, and how the data was gathered. The work isn’t finished: so far, he has “only” managed to characterize 100 types of neurons. The algorithms he used are also not perfect, so he had to complement them with a lot of manual checking and validation.
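To give a feel for what ‘reading’ a paper by algorithm can mean, here is a toy sketch in Python. It is not the NeuroElectro pipeline, whose internals aren't described here. It is just a regular expression that looks for a property name followed shortly by a number and a unit. The pattern, the function name `extract_properties` and the sample sentence are all made up for illustration.

```python
import re

# Hypothetical pattern: a property name, a few filler words, then a value and unit.
# Real papers phrase things in far more ways than this toy pattern can handle.
PATTERN = re.compile(
    r"(?P<prop>resting membrane potential|membrane time constant)"
    r"[^.\d\-−]{0,40}?"                      # short gap, no digits or sentence end
    r"(?P<value>[-−]?\d+(?:\.\d+)?)\s*"      # number, optionally negative
    r"(?P<unit>mV|ms)",
    re.IGNORECASE,
)

def extract_properties(text):
    """Return (property, value, unit) tuples found in a piece of text."""
    results = []
    for match in PATTERN.finditer(text):
        # Normalize the typographic minus sign before converting to float.
        value = float(match.group("value").replace("−", "-"))
        results.append((match.group("prop").lower(), value, match.group("unit")))
    return results

# Invented sentence, in the style of a methods or results section.
sample = ("The resting membrane potential was -65.2 mV, "
          "and the membrane time constant averaged 20 ms.")

for prop, value, unit in extract_properties(sample):
    print(f"{prop}: {value} {unit}")
```

A pattern like this will miss unusual phrasing and will sometimes grab the wrong number, which is exactly why automated extraction needs manual checking on top.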

“If we want to think about building a brain or re-engineering the brain, we need to know what parts we’re working with,” said Nathan Urban, director of Carnegie Mellon’s BrainHub neuroscience initiative, in a press release.

“We know a lot about neurons in some areas of the brain, but very little about neurons in others. To accelerate our understanding… we need to be able to easily determine whether what we already know about some neurons can be applied to others we know less about.”

The database, as well as the techniques used to create it (and future plans for improvement), are described in the Journal of Neurophysiology. It’s a great achievement, and a great tool for many researchers working in the field.

“It’s a dynamic environment in which people can collect, refine and add data,” Urban said of the NeuroElectro database. “It will be a useful resource to people doing neuroscience research all over the world.”

Via Science Alert.

Twitter to release all tweets for scientists: massive scientific tool, but also an ethical dilemma

Five hundred million tweets are sent each day. With so many details about the location, interests and behaviors of users, they are a trove of useful information for scientists who might be, for example, looking for patterns in human behavior, checking risk factors for health conditions or tracking the spread of infectious diseases.

There are many potential uses for this information. By analyzing emotional cues in pregnant women, Microsoft researchers developed an algorithm that predicts those at risk for postpartum depression. The United States Geological Survey uses Twitter to track the location of earthquakes as people tweet about tremors. I could go on for days.

However, while all tweets are public, researchers wanting to access them have to do it through Twitter’s application programming interface, which currently only looks through 1 percent of the archive – drastically limiting the amount of available data. But all that is about to change.

Twitter announced that in February 2015, they will make all their tweets dating back to 2006 available for scientific research – with everything up for grabs, the usage of Twitter as a research tool will likely skyrocket. With so many data points to mine, it’s almost impossible to think of all the potential applications.

But this also raises some tough ethical questions: will Twitter claim any legal rights to any scientific findings? It seems somewhat understandable, and they could make a very strong case. But the most important question is: is it ethical to use people’s data without their consent? Again, on one hand, it’s very valuable data, and scientists could make good use of it, ultimately providing benefits to mankind. But on the other hand, maybe I just don’t want to reveal my data because I don’t feel comfortable with it. How could this be solved?

Caitlin Rivers and Bryan Lewis, computational epidemiologists at Virginia Tech, published guidelines for the ethical use of Twitter data in February. It seems like common sense, but I guess it needed to be written down. The gist of it is: never reveal personal information about users. Username, location, personal preference, whatever – that’s private, and you should only publish statistical information. Rivers and Lewis argue that it is crucial for scientists to consider and protect users’ privacy as Twitter-based research projects multiply. Well, as Spider-Man said, with great data comes great responsibility! Or was that Snowden?
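For the curious, the “statistics, not people” idea can be sketched in a few lines of Python. This is only my illustration, not something taken from the Rivers and Lewis guidelines. The tweet fields (`user`, `region`, `text`), the function name `aggregate_mentions` and the minimum group size of 5 are all assumptions chosen for the example.

```python
from collections import Counter

def aggregate_mentions(tweets, keyword, min_count=5):
    """Count keyword mentions per region, dropping all user-level fields.

    `tweets` is a list of dicts with hypothetical keys 'user', 'region', 'text'.
    Regions with fewer than `min_count` matches are suppressed, since a very
    small count could point back to individual users.
    """
    counts = Counter(
        tweet["region"]
        for tweet in tweets
        if keyword.lower() in tweet["text"].lower()
    )
    return {region: n for region, n in counts.items() if n >= min_count}

# Invented data: five matching tweets from one region, one from another.
tweets = [
    {"user": f"user{i}", "region": "VA", "text": "Down with the flu again"}
    for i in range(5)
]
tweets.append({"user": "user99", "region": "NY", "text": "Flu season is here"})

print(aggregate_mentions(tweets, "flu"))
```

Only the per-region totals come out the other end. Usernames never leave the function, and the lone New York tweet is dropped rather than reported as a count of one.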