Robots and AI can help us better understand deep sea species, study reports.

Robots and artificial intelligence may be just what we need to meet the denizens of the ocean floor, a new study reports.

Scuba Diver.

Image via Pixabay.

Artificial intelligence (AI) has an important role to play in helping us understand the large variety of species living on the ocean floor, new research from the University of Plymouth reports. Such systems could finally allow marine researchers to push past the efficiency bottleneck created by human users analyzing recordings from the depths of the sea.

Davy Jones’ locker

“Autonomous vehicles are a vital tool for surveying large areas of the seabed deeper than 60m [the depth most divers can reach],” says PhD student Nils Piechaud, lead author on the study. “But we are currently not able to manually analyse more than a fraction of that data.”

“This research shows AI is a promising tool but our AI classifier would still be wrong one out of five times, if it was used to identify animals in our images.”

The new study analyzed the effectiveness of a computer vision (CV) system in taking over the role of humans in analyzing deep-sea images. All in all, the team found, such a system is around 80% accurate in identifying various animals in images of the seabed, but can be up to 93% accurate for specific species if enough data is used to train the algorithm. The authors say these results suggest CV could soon be routinely employed to study marine animals and plants. In such a case, it would lead to a major increase in data availability for conservation research and biodiversity management, they add.

“But we are not at the point of considering it a suitable complete replacement for humans at this stage,” Piechaud notes.

The team used Google’s TensorFlow, an open-source library, to teach a (pre-trained) neural network to identify individuals of deep-sea species found in images taken by autonomous underwater vehicles (AUVs). One of these AUVs, known as Autosub6000, was deployed back in May 2016 on the north-east side of Rockall Bank, UK, and collected over 150,000 images in a single dive. Around 1,200 of these images were manually analyzed, containing 40,000 individuals of 110 different kinds of animals (morphospecies), most of them only seen a handful of times.
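
The study's own code isn't reproduced here, but the general recipe (take a network pre-trained on everyday photos, then retrain only a new classification head on labelled seabed images) can be sketched with TensorFlow's Keras API. The folder layout, image size, base architecture, and class count below are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code): retrain the top of a pre-trained
# image classifier on manually annotated seabed images with TensorFlow.
import tensorflow as tf

NUM_MORPHOSPECIES = 110   # assumption: one class per annotated morphospecies
IMG_SIZE = (224, 224)     # assumption: a standard input size for the base model

# Assumption: annotated image crops sorted into one folder per morphospecies.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "annotated_seabed_images/", image_size=IMG_SIZE, batch_size=32)

# Start from a network pre-trained on ImageNet and freeze its weights;
# only the new classification head learns from the seabed data.
base = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                      pooling="avg", input_shape=IMG_SIZE + (3,))
base.trainable = False

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = tf.keras.applications.resnet50.preprocess_input(inputs)
x = base(x, training=False)
outputs = tf.keras.layers.Dense(NUM_MORPHOSPECIES, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```

In practice you would also hold out a validation split and perhaps unfreeze part of the base network later, but the point is the one the study makes: most of the heavy lifting is done by a network that has already learned general visual features.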

Manual annotation accuracy ranged from 50 to 95% on this dataset; however, the process was very slow. And, as you can guess from that ‘ranged’ part, it was quite inconsistent across different teams and work intervals. The automated method reached around 80% accuracy, approaching the performance of humans with a clear speed and consistency advantage. The software worked particularly well for certain morphospecies. For example, it correctly identified a type of xenophyophore 93% of the time.

So should we just use it instead of marine biologists? Well, the study’s authors don’t think that would be a good idea. The study makes a case for automated systems working in tandem with marine biologists, not replacing them. The AI could greatly enhance scientists’ ability to analyze the data before them.

Combining the ability of high-tech AUVs to survey large areas of the seabed, the fast data-crunching power of AI, and the expertise of marine biologists could massively speed up the rate of deep-ocean exploration, and with it our wider understanding of marine ecosystems.

“Most of our planet is deep sea, a vast area in which we have equally large knowledge gaps,” says Dr Kerry Howell, Associate Professor in Marine Ecology and Principal Investigator for the Deep Links project.

“With increasing pressures on the marine environment including climate change, it is imperative that we understand our oceans and the habitats and species found within them. In the age of robotic and autonomous vehicles, big data, and global open research, the development of AI tools with the potential to help speed up our acquisition of knowledge is an exciting and much needed advance.”

The paper “Automated identification of benthic epifauna with computer vision” has been published in the journal Marine Ecology Progress Series.

Artificial intelligence still has severe limitations in recognizing what it’s seeing

Artificial intelligence won’t take over the world any time soon, a new study suggests — it can’t even “see” properly. Yet.

Teapot golfball.

Teapot with golf ball pattern used in the study.
Image credits: Nicholas Baker et al / PLOS Computational Biology.

Computer networks that draw on deep learning algorithms (often referred to as AI) have made huge strides in recent years. So much so that there is a lot of anxiety (or enthusiasm, depending on which side of the debate you find yourself on) that these networks will take over human jobs and other tasks that computers simply couldn’t perform up to now.

Recent work at the University of California Los Angeles (UCLA), however, shows that such systems are still in their infancy. A team of UCLA cognitive psychologists showed that these networks identify objects in a fundamentally different manner from human brains — and that they are very easy to dupe.

Binary-tinted glasses

“The machines have severe limitations that we need to understand,” said Philip Kellman, a UCLA distinguished professor of psychology and a senior author of the study. “We’re saying, ‘Wait, not so fast.’”

The team explored how machine learning networks see the world in a series of five experiments. Keep in mind that the team wasn’t trying to fool the networks — they were working to understand how they identify objects, and if it’s similar to how the human brain does it.

For the first one, they worked with a deep learning network called VGG-19. It’s considered one of the (if not the) best networks currently developed for image analysis and recognition. The team showed VGG-19 altered, color images of animals and objects. One image showed the surface of a golf ball displayed on the contour of a teapot, for example. Others showed a camel with zebra stripes or the pattern of a blue and red argyle sock on an elephant. The network was asked what it thought the picture most likely showed in the form of a ranking (with the top choice being most likely, the second one less likely, and so on).
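
For readers curious what “asking” a network looks like in practice, here is a rough sketch of that kind of query using the publicly available Keras copy of VGG-19; the image file name is a placeholder, not one of the study’s actual stimuli.

```python
# Sketch: hand a pre-trained VGG-19 an image and print its ranked guesses.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.vgg19 import (VGG19, decode_predictions,
                                                 preprocess_input)

model = VGG19(weights="imagenet")   # outputs probabilities over 1,000 ImageNet classes

# Placeholder file name, e.g. the teapot textured like a golf ball.
img = tf.keras.utils.load_img("golfball_teapot.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(tf.keras.utils.img_to_array(img), axis=0))

preds = model.predict(x)
for _, label, prob in decode_predictions(preds, top=5)[0]:
    print(f"{label}: {prob:.2%}")   # top choice first, e.g. 'golf_ball' above 'teapot'
```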

Combined images.

Examples of the images used during this step.
Image credits Nicholas Baker et al., 2018, PLOS Computational Biology.

VGG-19, the team reports, listed the correct item as its first choice for only 5 out of the 40 images it was shown during this experiment (12.5% success rate). It was also interesting to see just how well the team managed to deceive the network. VGG-19 listed a 0% chance that the argyled elephant was an elephant, for example, and only a 0.41% chance that the teapot was a teapot. Its first choice for the teapot image was a golf ball, the team reports.

Kellman says he isn’t surprised that the network suggested a golf ball — calling it “absolutely reasonable” — but was surprised to see that the teapot didn’t even make the list. Overall, the results of this step hinted that such networks draw on the texture of an object much more than its shape, says lead author Nicholas Baker, a UCLA psychology graduate student. The team decided to explore this idea further.

Missing the forest for the trees

For the second experiment, the team showed images of glass figurines to VGG-19 and a second deep learning network called AlexNet. Both networks were trained to recognize objects using a database called ImageNet. While VGG-19 performed better than AlexNet, they were still both pretty terrible. Neither network could correctly identify the figurines as their first choice: an elephant figurine, for example, was ranked with almost a 0% chance of being an elephant by both networks. On average, AlexNet ranked the correct answer 328th out of 1,000 choices.
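
A number like “328th out of 1,000” is simply where the true label lands once the network’s class probabilities are sorted from most to least likely. Reusing the model and preprocessed input from the earlier sketch (the ImageNet class index in the usage comment is the only new assumption):

```python
# Sketch: position of the true class in the network's ranked output.
import numpy as np

def rank_of_true_class(model, x, true_class_index):
    """Return 1 if the true class is the top guess, 2 if it is second, and so on."""
    probs = model.predict(x)[0]          # 1,000 ImageNet class probabilities
    order = np.argsort(probs)[::-1]      # class indices, best guess first
    return int(np.where(order == true_class_index)[0][0]) + 1

# Hypothetical usage: index 385 is 'Indian elephant' in the standard Keras ImageNet mapping.
# print(rank_of_true_class(model, x, true_class_index=385))
```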

Glass figurines.

Well, they’re definitely glass figurines to you and me. Not so obvious to AI.
Image credits Nicholas Baker et al / PLOS Computational Biology.

In this experiment, too, the networks’ first choices were pretty puzzling: VGG-19, for example, chose “website” for a goose figure and “can opener” for a polar bear.

“The machines make very different errors from humans,” said co-author Hongjing Lu, a UCLA professor of psychology. “Their learning mechanisms are much less sophisticated than the human mind.”

“We can fool these artificial systems pretty easily.”

For the third and fourth experiments, the team focused on contours. First, they showed the networks 40 drawings outlined in black, with the images in white. Again, the machines did a pretty poor job of identifying common items (such as bananas or butterflies). In the fourth experiment, the researchers showed both networks 40 images, this time in solid black. Here, the networks did somewhat better — they listed the correct object among their top five choices around 50% of the time. They identified some items with good confidence (a 99.99% chance for an abacus and a 61% chance for a cannon from VGG-19, for example) while they simply dropped the ball on others (both networks gave a white hammer outlined in black less than a 1% chance of being a hammer).
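
That “top five choices around 50% of the time” figure is an ordinary top-5 accuracy, which takes only a few lines to compute; the batch of silhouette images and their true labels below are assumed to have been prepared the same way as in the earlier sketches.

```python
# Sketch: top-5 accuracy over a batch of preprocessed silhouette images.
import numpy as np

def top5_accuracy(model, batch, true_class_indices):
    """Fraction of images whose true class is among the network's five best guesses."""
    probs = model.predict(batch)                # shape: (n_images, 1000)
    top5 = np.argsort(probs, axis=1)[:, -5:]    # indices of the five largest probabilities
    hits = [true in row for true, row in zip(true_class_indices, top5)]
    return sum(hits) / len(hits)
```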

Still, it’s undeniable that both algorithms performed better during this step than any other before them. Kellman says this is likely because the images here lacked “internal contours” — edges that confuse the programs.

Throwing a wrench in

Now, in experiment five, the team actually tried to throw the machines off their game as much as possible. They worked with six images that VGG-19 had identified correctly in the previous steps, scrambling them to make them harder to recognize while preserving some pieces of the objects shown. They also employed a group of ten UCLA undergraduates as a control group.

The students were shown objects in black silhouettes — some scrambled to be difficult to recognize and some unscrambled, some objects for just one second, and some for as long as the students wanted to view them. Students correctly identified 92% of the unscrambled objects and 23% of the scrambled ones when allowed a single second to view them. When the students could see the silhouettes for as long as they wanted, they correctly identified 97% of the unscrambled objects and 37% of the scrambled objects.

Silhouette and scrambled bear.

Example of a silhouette (a) and scrambled image (b) of a bear.
Image credits Nicholas Baker et al / PLOS Computational Biology.

VGG-19 correctly identified five of these six images (and was quite close on the sixth, too, the team writes). The team says humans probably had more trouble identifying the images than the machine because we observe the entire object when trying to determine what we’re seeing. Artificial intelligence, in contrast, works by identifying fragments.

“This study shows these systems get the right answer in the images they were trained on without considering shape,” Kellman said. “For humans, overall shape is primary for object recognition, and identifying images by overall shape doesn’t seem to be in these deep learning systems at all.”

The results suggest that right now, AI (as we know and program it) is simply too immature to actually face the real world. It’s easily duped, and it works differently than we do — so it’s hard to intuit how it will behave. Still, understanding how such networks ‘see’ the world around them would be very helpful as we move forward with them, the team explains. If we know their weaknesses, we know where we need to focus our efforts to make meaningful strides.

The paper “Deep convolutional networks do not classify based on global object shape” has been published in the journal PLOS Computational Biology.

Google’s Neural Machine can translate nearly as well as a human

A new translation system unveiled by Google, the Google Neural Machine Translation (GNMT) framework, comes close to human translators in proficiency.

Public domain image.

Not knowing the local language can be hell — but Google’s new translation software might prove to be the bilingual travel partner you’ve always wanted. A recently released paper notes that Google’s Neural Machine Translation system (GNMT) reduces translation errors by an average of 60% compared to the familiar phrase-based approach. The framework is based on unsupervised deep learning technology.

Deep learning simulates the way our brains form connections and process information inside a computer. Virtual neurons are mapped out by a program, and the connections between them receive a numerical value, a “weight,” which determines how each neuron treats the data fed into it. Early layers of neurons pick out the basic features of the data and pass them on to deeper layers for further processing, and so on. The end goal is to create software that can learn to recognize patterns in data and respond to each one accordingly.

Programmers train these frameworks by feeding them data, such as digitized images or sound waves. They rely on big sets of training data and powerful computers to work effectively, which are becoming increasingly available. Deep learning has proven its worth in image and speech recognition in the past, and adapting it to translation seems like the logical next step.
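
To make that picture concrete, here is what such a network looks like when it is actually written down in Keras; the layer sizes and the ten-class output are arbitrary choices for illustration, not anything taken from the systems discussed here.

```python
# Sketch: a tiny deep network. Each Dense layer is a bank of "virtual neurons",
# and the numbers it learns during training are the weights described above.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),                     # e.g. a flattened 28x28 image
    tf.keras.layers.Dense(128, activation="relu"),    # early layers pick out simple features
    tf.keras.layers.Dense(64, activation="relu"),     # deeper layers combine them
    tf.keras.layers.Dense(10, activation="softmax"),  # one probability per output class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training amounts to feeding the network labelled examples, for instance:
# model.fit(train_images, train_labels, epochs=5)
```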

And it works like a charm

GNMT draws on 16 processors to transform each word into a value called a “vector,” which represents how closely it relates to other words in its training database — 2.5 billion sentence pairs for English and French, and 500 million for English and Chinese. “Leaf” is more closely related to “tree” than to “car,” for example, and the name “George Washington” is more closely related to “Roosevelt” than to “Himalaya.” Using the vectors of the input words, the system chooses a list of possible translations, ranked by their probability of occurrence. Cross-checking helps improve overall accuracy.
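
That “vector” is what machine-learning people call a word embedding: a list of numbers learned from the training text, arranged so that related words end up pointing in similar directions. The toy vectors below are invented purely to show how that relatedness is usually measured (cosine similarity); real embeddings have hundreds of dimensions and are learned from billions of sentences.

```python
# Sketch: toy word vectors and cosine similarity, the usual relatedness measure.
# These numbers are made up for illustration only.
import numpy as np

embeddings = {
    "leaf": np.array([0.9, 0.8, 0.1]),
    "tree": np.array([0.8, 0.9, 0.2]),
    "car":  np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["leaf"], embeddings["tree"]))  # high: related words
print(cosine_similarity(embeddings["leaf"], embeddings["car"]))   # low: unrelated words
```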

The increased accuracy in translation came about because Google let its neural network work without much of the previous supervision from programmers: engineers fed in the initial data, then let the computer take over and train itself. This approach is called unsupervised learning, and it has proven to be more efficient than previous supervised learning techniques, where humans held a large measure of control over the learning process.

In a series of tests pitting the system against human translators, it came close to matching their fluency for some languages. Bilingually fluent people rated the system between 64 and 87 percent better than the previous one. While some things still slip through GNMT’s fingers, such as slang or colloquialisms, those are some solid results.

Google is already using the new system for Chinese to English translation, and plans to completely replace its current translation software with GNMT.