A Q&A with Sara Melvin, Data Scientist and UCLA Graduate Student
Will humans be needed in the future or will machines replace them?
It doesn’t seem plausible that machines will replace every human job. However, machines have been slowly replacing human workers since the Industrial Revolution, so it seems inevitable that more human jobs will be replaced in the future. Why pay a person (who needs breaks, needs sleep, can cause accidents, and has a union representative) to transport products around? Why wouldn’t Uber replace its drivers with self-driving cars in the future?
However, to some degree machines will need humans. Machine learning is 99% human work for many reasons, such as defining target concepts that are not well defined: “Ill-defined problems present a dilemma for planning: how can one plan the route towards a solution if one knows so little about the path ahead, especially when one does not know the final destination or goal state.” An example of this is the nine-dot puzzle, where the task is to connect the dots with four straight lines without lifting your pen. The problem lacks a visualizable goal: if you assume that your pen must stay within the bounds of the nine dots, you will never reach a solution. There are just some problems out there that need out-of-the-box thinkers who creatively offer solutions never tried before.
How do humans and machines work together?
Currently, humans and machines work together in several different ways. Humans have long used machines as tools to make their labor easier (e.g., the microwave, the bulldozer, power tools). Now humans use machines to suggest the best route to a destination given the traffic conditions, manage their household budgets, detect fraud on their credit cards, automatically tag Facebook friends in vacation photos, recommend products or movies they want to see, and so on. All of these use cases come from a human programmer who created and trained an algorithm so that a machine knows what to search for, recommend, detect, and categorize.
Professionals work with machines to design, search, detect, categorize, network, and communicate. At the Scalable Analytics Institute at UCLA, I run algorithms that sort, filter, organize, cluster, and then classify phrases in millions of tweets to detect potential events. On my own this would take years, but with just my personal computer I can do the work in an hour or less. Production engineers use machines to make other machines with precision; graphic designers use computers to design beautiful brochures, websites, logos, and business cards; and marketers use computers to keep track of listings, target new clients, and advertise new listings. The list goes on and on.
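The tweet-analysis work described above can be sketched in miniature. The toy example below is not the actual research code: the burst heuristic, thresholds, and stopword list are all illustrative assumptions. It flags terms that spike in a recent window of tweets relative to background chatter, a crude stand-in for event detection:

```python
# Toy event-detection sketch over tweets: terms whose frequency in the
# current window far exceeds their background rate become event candidates.
# All names and thresholds here are illustrative, not the real pipeline.
from collections import Counter
import re

STOPWORDS = {"the", "a", "is", "to", "and", "of", "in", "on", "at"}

def tokenize(tweet):
    """Lowercase and keep word-like tokens, dropping stopwords."""
    return [t for t in re.findall(r"[a-z#@']+", tweet.lower())
            if t not in STOPWORDS]

def burst_terms(window, background, min_ratio=3.0, min_count=2):
    """Return terms unusually frequent in `window` versus `background`."""
    win = Counter(t for tw in window for t in tokenize(tw))
    bg = Counter(t for tw in background for t in tokenize(tw))
    return sorted(term for term, count in win.items()
                  if count >= min_count
                  and count / (bg[term] + 1) >= min_ratio)

background = ["nice weather today", "going to work", "lunch was good"]
window = ["earthquake downtown!", "felt an earthquake just now",
          "earthquake shaking everything"]
print(burst_terms(window, background))  # → ['earthquake']
```

A real system would add many more stages (spam filtering, clustering related terms, geolocation), but the core idea of comparing a window against a background distribution scales naturally to millions of tweets.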
In the future, however, it is possible that the work shared between humans and machines will evolve. It may not be too far in the future that humans will be teaching machines general, repetitive tasks (e.g., Baxter the robot).
Is it possible for a machine to learn to do anything that people can do?
This is a controversial subject. Is everything that a person is able to do a learnable task? I mentioned before that some problems are ill-defined, where the goal and the path to reach it are not well defined. How can a computer suggest an idea outside that which a human has defined or taught it? Creative programs like David Cope’s Emily Howell, and its predecessor Emmy, still need well-defined inputs and outcomes. Emmy had to be taught by Cope to compose piano pieces, but would Emmy come up with a prepared-piano piece without being taught that such a thing was even possible?
Please give examples of how data science will impact people’s lives in the future.
Data science has already impacted people’s lives in areas such as healthcare, business, and social networking. How will it impact people’s lives in the future? We will have to find out. Healthcare could be personalized by genome, disease outbreaks could be contained more quickly, images and videos could be searched semantically, and education could be tailored to individual learning styles.
How did Google’s AlphaGo win its Go match against a human?
The game of Go has vastly more possible moves and positions than chess, but it is a well-defined problem. AlphaGo used Monte Carlo tree search, guided by policy and value networks, to bound the enormous search space and mitigate the horizon effect; it also used reinforcement learning to keep from learning bad moves that happened to lead to ultimately good outcomes. Because the problem is well defined, an evaluation model could be built to quantify “good” versus “bad” moves. Finally, AlphaGo played millions of games against itself, which allowed it to continuously learn better moves.
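The idea of an evaluation model bounding a game-tree search can be shown on a toy game. This is emphatically not AlphaGo’s method (which pairs Monte Carlo tree search with deep neural networks); it is a minimal depth-limited negamax on the simple “count to 21” game, where a known heuristic (totals divisible by four are losing for the player to move) stands in for the evaluation model:

```python
# Players alternately add 1-3 to a running total; whoever says 21 loses.
# A depth-limited negamax uses an evaluation function to score positions
# it cannot search to the end -- the "good vs. bad move" idea in miniature.
TARGET = 21

def evaluate(total):
    """Heuristic: totals divisible by 4 are losing for the player to move."""
    return -1.0 if total % 4 == 0 else 1.0

def negamax(total, depth):
    if total == TARGET:
        return 1.0                 # opponent just said 21 and lost
    if depth == 0:
        return evaluate(total)     # bound the search with the evaluation
    return max(-negamax(total + m, depth - 1)
               for m in (1, 2, 3) if total + m <= TARGET)

def best_move(total, depth=4):
    """Pick the increment with the highest negamax score."""
    moves = [m for m in (1, 2, 3) if total + m <= TARGET]
    return max(moves, key=lambda m: -negamax(total + m, depth - 1))

print(best_move(18))  # → 2 (move to 20, forcing the opponent to say 21)
```

The deeper the search and the better the evaluation, the stronger the play; AlphaGo’s innovation was learning an extremely good evaluation (its value network) rather than hand-coding one.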
How does natural language processing differ on Twitter versus other social media versus other types of text?
Natural language processing (NLP) differs in its challenges depending on the platform.
By design, Twitter is an informal platform that restricts messages to 140 characters. This means many users misspell and abbreviate words to save characters. These misspellings and abbreviations aren’t standardized, which makes NLP harder still. Because of Twitter’s informal nature, NLP systems cannot reliably use tree-structured or recursive learning algorithms that depend on correct English sentence structure. It is also challenging to recover the semantic meaning of vague, abbreviated, or misspelled words.
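A tiny sketch of what this means in practice: before any parsing or tagging, tweets typically need a normalization pass. The abbreviation table below is purely illustrative; in reality these mappings are not standardized, which is exactly the difficulty described above (note that “r” slips through unexpanded):

```python
# Minimal tweet normalization: lowercase, strip URLs and @mentions,
# expand a (hypothetical) table of known abbreviations.
import re

ABBREVIATIONS = {"u": "you", "gr8": "great", "b4": "before", "thx": "thanks"}

def normalize_tweet(text):
    """Lowercase, drop links and mentions, expand known abbreviations."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "", text)   # drop links
    text = re.sub(r"@\w+", "", text)           # drop @mentions
    tokens = re.findall(r"[\w#']+", text)
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

print(normalize_tweet("Thx @sara, u r gr8! http://t.co/xyz"))
# → thanks you r great
```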
Other social media platforms do not have the 140-character limit, so words are usually spelled out in full. However, these platforms are still informal: words can be used incorrectly or misspelled, slang is common, and the rules of English sentence structure do not always apply. Tree-structured and recursive algorithms therefore struggle in these environments as well. Still, the fact that most words are spelled out mitigates the semantic problems of parsing the text.
Other types of text will often follow the rules of English grammar and contain non-abbreviated, correctly spelled words. This environment is the least challenging of the three, where learning algorithms that rely on proper English sentence structure have a chance to be effective.