The human brain is often called the ‘crowning glory of evolution.’ It contains about 86 billion nerve cells known as neurons, each with an average of 7,000 connections to other neurons known as synapses. Our brains form millions of new synapses every second, constantly reshaping the pattern and strength of these connections, which store experiences, create habits and shape personalities.
At the India edition of Maker Faire 2018 organized in Hyderabad earlier this month, volunteers from Microsoft Garage India collaborated with members of the external Maker Community to demonstrate the functioning of the human brain. Their project, the Brain Installation, showcased the power of art and technology. The objective of the structure was to simulate the anatomy of the human brain in a simple and comprehensible manner for the audience.
The anatomy of the human brain
The human brain controls every movement, thought and emotion. It is a highly complex structure comprising many intricate sub-structures and parts. In this first phase of the Brain Installation project, our focus was chiefly on the cerebral cortex. The cerebral cortex, part of the cerebrum, is the outermost layer of the brain. It is also the part responsible for distinct human traits such as memory, rational thinking and imagination, as well as language and consciousness. In effect, the cerebral cortex sets us apart from other species. It is segmented into four lobes, which are associated with a wide spectrum of functions ranging from reasoning to auditory perception.
Temporal lobe: processing of auditory stimuli, memory, speech perception and understanding the minds of others
Occipital lobe: processing of visual stimuli
Frontal lobe: reasoning, planning, motor skills, problem solving and higher-level cognition
Parietal lobe: movement, orientation, perception of stimuli (touch, pain, pressure etc.) and recognition
Most brain functions are processed in multiple lobes since they work in conjunction with each other, strongly connected through a complex network of neurons. For instance, the process of higher-level decision making – unique to human beings – utilizes the frontal lobe and, within it, the prefrontal cortex. These regions are responsible for intelligent thought, complex calculations and cognition.
The brain installation
To build the brain model, volunteers from Microsoft Garage studied a 3D model of the brain, slicing it into different parts and lobes. They then used software to create 2D drawings of all the slices. This understanding was applied to the final structure – the brain installation.
The brain installation was an acrylic structure weighing about 100 kg. It measured around 2.5 feet in length and height, and about 2 feet in width. Different slices of acrylic sheet, each about 6 mm thick, represented different parts of the brain. Around 25 slices were assembled longitudinally and 13 were stacked horizontally. The structure rested on a metallic base and was propped up at a height of 7 feet.
Further, around 350 LED lamps were used to light up different sections of the brain structure. Each color represented a different part.
How did the brain installation work?
The brain installation was an amalgamation of various technologies and software. Given below is the detailed process of its functioning:
Volunteers wore an electroencephalography (EEG) headset, a brainwave-sensing instrument that monitors and records the electrophysiological activity of the brain. In this case, our volunteers used the NeuroSky MindWave Mobile headset, which placed electrodes along the volunteer's scalp.
The sensors in the headset could detect, measure and record EEG power spectrum data (brain waves such as alpha, beta and gamma waves). The data could also be exported to third-party applications for downstream analysis and processing.
The data gathered by the EEG headset was fed into a C# program developed by Garage volunteers. The program used a machine learning model to map the emitted brain waves to the corresponding lobes.
The output from the C# program was captured in a microcontroller (Arduino Mega) which in turn would send signals to the PCB (designed and manufactured by the Garage Team).
The PCBs were designed to send signals to the brain installation and light up parts which were activated.
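The flow from headset to lit-up lobes can be sketched in a few lines. The following is a hypothetical Python illustration, not the actual Garage code (which was written in C#); the band-to-lobe mapping and the `LIGHT:` command format are assumptions made purely for illustration.

```python
# Hypothetical sketch of the headset-to-LED pipeline described above.
# Band powers, the band-to-lobe mapping, and the serial command format
# are assumptions, not the actual Garage implementation.

LOBE_FOR_BAND = {
    "alpha": "occipital",   # relaxed, eyes-closed states
    "beta":  "frontal",     # active concentration
    "gamma": "temporal",    # memory and auditory processing
    "theta": "parietal",    # drowsiness, spatial tasks
}

def dominant_band(band_powers):
    """Return the band with the highest measured power."""
    return max(band_powers, key=band_powers.get)

def led_command(band_powers):
    """Map the dominant band to a lobe and build a command string
    that a microcontroller could parse to light that section."""
    lobe = LOBE_FOR_BAND[dominant_band(band_powers)]
    return f"LIGHT:{lobe}"

sample = {"alpha": 0.12, "beta": 0.55, "gamma": 0.20, "theta": 0.08}
print(led_command(sample))  # beta dominates -> "LIGHT:frontal"
```

In the real installation, a command like this would be written to the Arduino Mega over a serial port, which would then drive the PCBs.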
When a volunteer donned the EEG headset, they would see the structure light up to reflect the exact portion of their brain activated at that moment. The lighting would then change to showcase different parts of the brain being stimulated as the volunteer changed activities.
For instance, if a volunteer were connected to the brain installation while reading this article, the Broca’s area (which is responsible for language processing) and the hippocampus (which is associated with curiosity) would be illuminated. Likewise, during a task requiring high concentration, the structure would indicate that the prefrontal and parietal cortices are at work.
Taking the project forward
Commenting on the success of the project, Reena Dayal Yadav, Director, Microsoft Garage – India said, “The human brain is most fascinating and we have so much to learn from how the brain functions. The brain installation provided a rare experience by demonstrating some of the activities and the corresponding part of the brain. The model piqued the curiosity of members of all age groups and was widely appreciated by the audience.”
The volunteers at Microsoft Garage are currently working towards increasing the accuracy of the brain installation.
eSports, or competitive video gaming, brings together millions of players and viewers from across the world. Having witnessed a steep rise in popularity over the last few years, industry estimates forecast the viewer base to grow from 230 million in 2015 to 427 million in 2019.
With the growing fanbase, the viewing arena has also expanded from television sets at homes to gigantic screens in stadiums. Professional teams participate in multiplayer competitions which often have prizes worth millions of dollars.
Following the success of predicting cinematic awards, the soccer World Cup and other big-ticket sporting events, Bing is now predicting major eSports tournaments. Bing’s first foray into eSports was in 2016, and it has since expanded to cover major tournaments for multiplayer online battle arena (MOBA) games such as Dota 2 and League of Legends.
Currently, Bing is predicting the most followed eSports event, the League of Legends World Championship, also known as ‘The Worlds’. With the finals of the 2018 World Championship being held on November 3, 2018, in South Korea between Fnatic and Invictus Gaming, Bing predicts Invictus Gaming to take home the prize money of $823,250. Last year, the finals of the tournament were watched by 60 million people, making it one of the world’s most-watched matches.
Predictability in eSports
Building prediction models for eSports differs from predicting the outcomes of traditional sporting events. Compared to most sports, eSports offers richer data, yet the element of unpredictability is much higher, primarily for two reasons. First, frequent roster changes make it difficult to evaluate current team performance. Second, the selection phase plays a vital role in the outcome of the match: each player chooses from a myriad of characters, resulting in team combinations that are difficult to predict beforehand. The actions of a handful of players can affect the outcome of the tournament. This uncertainty makes forecasting eSports events more challenging, and more fun.
The building blocks of the predictive model
We use a combination of long-term signals (such as matchmaking rating) and short-term spikes to balance between teams that have a historically good record as opposed to teams that are currently performing well. Moreover, we account for team-level and player-level statistics.
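As a rough illustration of balancing the two signal types, a linear blend might look like the following. This is a sketch with assumed weights and a made-up scenario; Bing's actual model and features are not public.

```python
# Illustrative sketch only: Bing Predicts' real model and weights are
# not public. Blend a slowly-moving rating with recent form using an
# assumed mixing weight alpha.

def blended_strength(long_term_rating, recent_win_rate, alpha=0.7):
    """Combine a long-term signal (e.g. a matchmaking rating scaled to
    0-1) with a short-term signal such as win rate over recent matches.
    alpha controls how much history dominates."""
    return alpha * long_term_rating + (1 - alpha) * recent_win_rate

# A historically strong team in a slump vs. a surging newcomer:
veteran = blended_strength(0.85, 0.40)   # ~0.715
upstart = blended_strength(0.60, 0.90)   # ~0.69
print(veteran > upstart)  # at alpha=0.7, history still edges it out
```

Lowering `alpha` would shift the balance toward teams that are currently performing well, which is the trade-off the paragraph above describes.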
To counter the constant change in line-ups, Bing Predicts includes a model that tracks team performance as well as the career performance of individual players, based on in-game features such as gold per minute (GPM) and experience per minute (XPM) along with external features such as win rates. Further, our model computes changes in key statistics for each player and team after a roster shuffle to quickly update predictions. Even so, this was difficult for the Dota 2 finals at The International 2018, where roster changes in Evil Geniuses’ team and OG’s new line-up added complexity.
Another factor that plays an important role in deciding the outcome of games like Dota 2 is the drafting phase. To account for it, we computed a measure of each team’s strength relative to its peers, which helped the model determine the overall winner.
These lessons will help us improve the prediction competencies of our models for the upcoming tournaments.
Just as in real-world sport, the human spirit plays an important role in determining the winners of virtual games. At Bing Predicts, our aim is to make our model stronger and better with every game and add to the excitement.
“Do Bigha Zameen movie lo protagonist evaru?” Indian readers who speak any combination of Hindi, English, or Tamil, could translate this question with ease - “Who is the protagonist of the movie Do Bigha Zameen?”
The ability to communicate using complex and nuanced languages is one of the most special facets of being human. It is a unique skill that sets us apart as a species and weaves our communities together. However, in multilingual parts of the world, communities are held together by more than just one language.
In India, speakers rarely communicate in a single language. Hindi speakers may borrow English words, while English speakers may use Tamil phrases to fill gaps and convey their message more clearly. This form of code-mixing is both common and effortless for people who live in multilingual communities, and the tendency spills over into digital and social media platforms.
Code-mixing may be natural for multilingual people, but it presents a challenge for current artificial intelligence technologies. Artificial Intelligence (AI)-enabled assistants that detect, interpret, and generate responses in natural language are typically programmed in a single language. The inability to work with several languages simultaneously makes these AI-enabled tools less convenient for multilingual users. To make machines sound more human, the ability to interpret and produce mixed code is essential.
The challenge of code-mixing
Teaching machines to splice together multiple languages is challenging. The underlying framework of most AI-based models is built on a single language, usually English. Code-mixed phrases and sentences don’t have a clearly defined structure, syntax, or a universally accepted phonetic protocol. These sentences are free-flowing and casual, relying on the shared heritage of the two speakers. Recreating this through algorithms is a challenge.
Moreover, the problem is further compounded by a severe paucity of annotated sentences and of language analysis tools, such as part-of-speech taggers and sentence parsers, which are required for training AI algorithms to comprehend semantic meaning.
Solving the lack of appropriate code-mixed data
In collaboration with researchers Khyati Raghavi and Prof. Alan Black from Carnegie Mellon University (CMU), Pittsburgh, USA, and Prof. Manish Shrivastava from IIIT Hyderabad, we decided to build a QA system for code-mixed languages with minimal language resources. The essence of this approach lies in performing a simple word-level translation of the given question into English, a resource-rich language, and then leveraging the resources available in English to better understand the intent. Of course, we were aware that word-level translations may not be accurate. We still pursued the idea as a good starting point, one that could also seed the collection of real user questions in code-mixed language, useful for further tuning and improving our system.
Working through this idea, we built an end-to-end web-based Factoid QA system for code mixed languages called WebShodh. WebShodh employs machine learning and deep learning techniques to understand the intent of the question and type of the expected answer which has been expressed in code-mixed language. It currently supports Hinglish and Tenglish translations of questions and is hosted at http://tts.speech.cs.cmu.edu/webshodh/cmqa.php.
Given a code-mixed question as an input, WebShodh translates all the words into English and runs the translated query through four layers of processing to generate an appropriate response:
Question Classification: WebShodh applies a Support Vector Machine (SVM)-based Question Classifier (QC), which classifies the code-mixed question into one of a set of types: human, location, entity, abbreviation, description or numeric. The classifier was further improved with pre-trained word2vec embeddings (trained on web news) for every word, features derived from the words on either side of the ‘wh-’ word, and tuned SVM parameters. Together, these changes boosted classification accuracy from 63% to 71.96%.
Web Search: The translated queries are also used as an input in a web search API. The titles, snippets, and URLs of the top ten results are then used as fodder for the answer generation process.
Answer Generation: To generate candidates for answers, the model runs the titles and snippets generated from the search API through POS tagging, Chunking and Named Entity Recognizer tools. This helps the model narrow down the list of potential answers into a shorter list of most appropriate answers for a given query.
Answer Ranking: In order to offer only the most relevant answers, the model runs the final list of answer candidates through a ranking process. Each potential response is assigned a relevance score based on its similarity to the translated code-mixed question and all collected search data where the candidate answer occurs.
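The first and last steps above can be sketched in miniature. The hedged Python example below substitutes a toy three-word dictionary for the bilingual lexicon and plain token overlap for the trained ranker; the question, dictionary entries, and snippets are all illustrative, not WebShodh's actual data or code.

```python
# Toy illustration of word-level translation plus answer ranking.
# The dictionary and overlap scoring stand in for WebShodh's real
# bilingual lexicons and trained ranking model.

def word_level_translate(question, dictionary):
    """Translate a code-mixed question word by word; words already in
    English (absent from the dictionary) pass through unchanged."""
    return " ".join(dictionary.get(w, w) for w in question.lower().split())

def relevance(candidate, translated_question):
    """Score a candidate answer by token overlap between its supporting
    snippet and the translated question."""
    q = set(translated_question.split())
    return len(q & set(candidate["snippet"].lower().split())) / len(q)

hindi_en = {"kaun": "who", "hai": "is"}            # assumed toy lexicon
question = "director kaun hai of Sholay"
translated = word_level_translate(question, hindi_en)

candidates = [
    {"answer": "Ramesh Sippy", "snippet": "Ramesh Sippy is the director of Sholay"},
    {"answer": "1975",         "snippet": "Sholay released in 1975"},
]
best = max(candidates, key=lambda c: relevance(c, translated))
print(best["answer"])  # -> Ramesh Sippy
```

The real system replaces the overlap score with similarity computed against all collected search data, as described in the ranking step.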
The WebShodh model was evaluated using 100 questions each from Hinglish (a blend of Hindi and English) and Tenglish (a blend of Telugu and English). Each dataset was two-thirds native language and one-third English. The evaluation questions were collected from 10 bilingual native Hindi and Telugu speakers. The model achieved a Mean Reciprocal Rank (MRR) of 0.37 for Hinglish and 0.32 for Tenglish, meaning that, most of the time, the correct answer appears within the top three results.
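For reference, MRR averages the reciprocal of the rank at which each question's correct answer first appears; a sketch with made-up ranks:

```python
# Mean Reciprocal Rank: each question contributes 1/rank of its first
# correct answer; questions with no correct answer contribute zero.

def mean_reciprocal_rank(ranks):
    """ranks: rank of the first correct answer per question
    (None when the correct answer was not returned at all)."""
    return sum(1.0 / r for r in ranks if r) / len(ranks)

# Toy illustration: correct answers at ranks 1, 3, 2, and one miss.
print(mean_reciprocal_rank([1, 3, 2, None]))  # (1 + 1/3 + 1/2) / 4 ≈ 0.458
```

An MRR of 0.37 thus corresponds roughly to the correct answer sitting around rank three on average.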
Creating multilingual digital assistants
Today’s multilingual societies require software tools that support interaction in code-mixed (CM) languages. This increases the reach and impact of many product features that are today limited to users of a few resource-rich languages. WebShodh is testament that, despite severe resource constraints, AI-based systems can still be built for CM languages. WebShodh currently uses very few resources, such as bilingual dictionaries for the supported languages. Through more online user interactions, WebShodh can collect more user data with which the system could be re-trained to further boost the accuracy of results.
As part of future work, we are contemplating how we can extend this work to support multi-turn and casual conversations. This could pave the way for AI-enabled chatbots and virtual assistants who can interact with human users in a more natural manner. This technology could substantially improve the digital experience of multilingual users who instinctively code-mix their communication. By eliminating the language barrier, these improved models could unlock the potential of AI for millions of users and hundreds of multilingual communities across the world.
Today, professional teams and organizations across the world use Project Online, Microsoft’s project management platform, to streamline team processes, schedule tasks efficiently and manage resources and workflows for team projects. The platform stands out due to its highly-visual and intuitive interface.
However, many of the visual and design elements of Project Online have been inaccessible for users with visual impairment, deafness or motor disabilities. By incorporating accessibility considerations in the Project Online interface, we aim to address barriers for users with varying disabilities. Our team has been working on adding new and unique features to help users with disabilities leverage this online platform for better project outcomes.
Here’s how we made Project Online more accessible:
Enhancing accessibility for a wide set of users is a key challenge considering the unique design and visual features of Project Online. Project Online is used by a wide range of users who rely on different features and tools to get the most out of this productivity tool. With the platform serving large teams that could be working and collaborating remotely, upgrades to the platform need to be measured and well-planned. Legacy screen elements need to be modified to work with nearly every screen reader and browser combination. Accessibility features need to consider varying degrees of disabilities and different user preferences. Features need to be customizable to help users with a diverse set of accessibility needs to make the most of their online productivity suite.
Screen reader enhancements
Users with visual impairments use a screen reader to detect and work with on-screen elements. These screen readers rely on Web Accessibility Initiative – Accessible Rich Internet Applications (WAI-ARIA) tags to help users identify and interact with elements.
Our efforts to enhance accessibility for Project Online were focused on ARIA tags. Although ARIA tags have been helpful for users with visual impairments, compatibility issues and legacy screen readers have led to users missing out on key elements and features of Project Online. We rectified older ARIA tags and deployed software improvements to help screen readers work better with these tags. The result is that the latest version can differentiate between links, buttons, and other on-screen elements.
Screen reader highlighting the table outline for better navigability
While working with tables, the ARIA tags help users not only navigate the whole table and select individual cells, but also read the titles and text in each cell. The ability to highlight the whole table outline and edit specific cells improves navigability for users with visual impairments.
Upgraded ARIA tags also enable users to clearly identify images, graphs, and visual elements. Screen readers can now leverage dynamic swapper technology to announce a change or update to any cell in real time and let users know if a mandatory field in a form hasn’t been completed. To address contrast issues for people with low or impaired vision, we changed the colors to meet a contrast ratio of at least 4.5:1 between text and its background. This solves the issue of users being unable to identify or interact with information in tables, images, or forms due to the limitations of legacy screen reader software.
Color contrast between text and the background
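The 4.5:1 figure is the WCAG 2.x minimum contrast ratio for normal text, and it can be checked programmatically using the standard relative-luminance formula. A small Python sketch (the colors here are examples, not Project Online's palette):

```python
# WCAG 2.x contrast check: relative luminance per channel, then the
# ratio (L_lighter + 0.05) / (L_darker + 0.05). Colors are examples.

def relative_luminance(rgb):
    """WCAG relative luminance for an sRGB color given as 0-255 ints."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
print(round(black_on_white, 1))   # 21.0 -- the maximum possible ratio
print(black_on_white >= 4.5)      # comfortably passes the AA threshold
```

Running every text/background pair through a check like this is one way to verify a palette change such as the one described above.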
These key enhancements for screen readers solve a wide range of usability issues for people with visual impairments. With upgrades and additions to keyboard shortcuts we have improved accessibility for users with physical disabilities, deafness, and low mobility.
Keyboard shortcuts for better productivity
Quick and easy keyboard shortcuts reduce the range of motion required to interact with the Project Online platform. Although a limited range of keyboard shortcuts has always been part of Project Online, expanding these shortcuts has enhanced the platform’s accessibility. A simple keystroke can now help users swap between URLs and the main content on a screen. This ensures users can navigate the web page or move to a new web page when required without having to move a mouse or touch a screen.
Keyboard shortcuts help users swap between different elements on the screen. The Project Online dashboard is usually populated with intensive flowcharts, diagrams, tables, and graphics. By setting shortcuts, users with physical disabilities can easily navigate a whole table or data set to interact with specific elements on any page. They can apply the changes automatically across different pages of the same work portfolio. For users with visual impairments, changes were made so that the focus is on the foreground window on the screen. This enables the screen reader to announce the elements that pop up in the foreground, making it easier for users to navigate based on priority.
Keyboard shortcuts reduce the amount of effort required to interact with Project Online. With a few simple keystrokes, users with disabilities can navigate the interface and reduce the time it takes to edit their project management workflows.
At Microsoft, we adopt a holistic design approach to address accessibility challenges. Having upgraded legacy elements for better compatibility with the latest screen readers, Project Online can now handle dynamic elements better.
The assistive technology interacts with browser controls and document object models (DOM) to augment the regular Project Online view. The features leverage the new and improved capabilities of screen readers and the ease-of-use of keyboard shortcuts to deliver better accessibility.
Better collaboration with better accessibility
From planning and scheduling tasks to managing resources and analyzing reports, project management encompasses a number of responsibilities. Highly intuitive and visual tools like Project Online have helped project managers and teams of all sizes collaborate and get work done efficiently.
Our work to enhance the functionality of screen readers, keyboard shortcuts, and screen contrast settings is aimed at improving accessibility of the project and team management platform further.
Enhancing accessibility is a key element of our mission to empower every person and every organization on the planet to achieve more. At Microsoft, our commitment to accessibility extends across our product spectrum as we endeavor to deliver equivalent experiences to people with disabilities. We’re committed to investing more and more efforts and resources in this direction to enable the shared goal of wider accessibility and inclusion.
Work done in collaboration with Microsoft Research Redmond
A picture is worth a thousand words, at least to human beings. Machines often struggle to interpret and respond to images the way humans do.
In recent years, Artificial Intelligence (AI)-powered algorithms have combined image recognition with Natural Language Processing (NLP) to caption images presented to them by users. However, these are basic responses with literal descriptions of images and lack the depth or empathy found in human conversations.
With the growing adoption of AI agents and the ubiquitous use of images in communication, it is now essential for machines to interpret and respond to images naturally. To bridge this gap in communication, our team developed a new model for generating natural, human-like comments to images. Integrated with our desi Artificial Intelligence (AI)-based chatbot Ruuh, the model helps her respond to images like a human and hold a free-flowing conversation.
Essentially, this technology can help unlock the potential of AI-enabled assistive tools and facilitate increased user engagement by adding an emotional dimension to image comments. Images across the internet can be made more accessible by providing an emotion-aware description for alternative text (ALT text). Developers can leverage this new technology to create video games that provide players with witty observations on their gameplay, kiosks that provide users comments on their images and artificially-generated cricket commentary. Incorporating image commenting with emotional depth in AI-led interactions could thus add a whole new dimension to user experiences.
The challenge of emotion-aware image commenting
Caption generation is a core element of the image- and video-to-text domain of AI. Much of the research in this field has focused on enabling machines to detect and characterize objects in images. Existing deep learning-based image captioning methods extract visual features and recognizable objects from an image and use a language model to create basic sentences or captions for the image. Applying a Recurrent Neural Network (RNN) to these existing models can enable a machine to interpret a series of images and generate a story from them.
However, the existing models do not go any deeper. They can describe the objects in an image, any numbers or text, and even recognize human faces or animals, but they cannot create a sentence that evokes emotion or differentiate between the positive and negative experiences captured by the image.
There have been some attempts towards this direction in the past, like StyleNet (stylized captions), SentiCap (captions with sentiments), VQG (Visual Question Generation), etc. In this work, we extended these models to be able to generate human-like questions or comments based on the style and emotion detected.
The Image Commenting model is a benchmark for human-like comments on images. The comments go beyond descriptive machine-generated responses to express opinions, sentiments and emotions. The objective is to capture the user’s attention and drive engagement in a machine-generated conversation.
How the Image Commenting model works
The datasets for most studies of this nature involve human annotators who apply captions to images. However, such a controlled data generation environment was unsuitable for our model. To collect natural responses to images we extracted more than one million anonymized image-comment pairs from the internet. These pairs were filtered for sensitive material, political statements and adult content. The data was further processed to standardize the content - remove capitalizations, abbreviations and special characters to arrive at the final dataset.
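A simplified version of such normalization might look like the following. The abbreviation table and filtering rules here are illustrative assumptions, not the actual pipeline used on the million-pair dataset.

```python
# Illustrative normalization step: lowercase, expand a few common
# abbreviations, strip special characters. The exact filters applied
# to the real dataset are not public.

import re

ABBREVIATIONS = {"u": "you", "r": "are", "pls": "please"}  # assumed examples

def normalize_comment(text):
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s']", " ", text)              # drop special chars
    words = [ABBREVIATIONS.get(w, w) for w in text.split()]
    return " ".join(words)

print(normalize_comment("OMG!! U r SO cute <3"))  # -> "omg you are so cute 3"
```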
Comparing Image Commenting data to the traditional Microsoft COCO data was a crucial step in ensuring the data was as natural as possible. Our analysis revealed that Image Commenting data was more sentimental and emotional, while the COCO data was more factual. The top word in the Image Commenting dataset was “like” whereas the top word in the COCO set was “sitting”. In fact, many of the most frequently used words in the Image Commenting data were sentimental, such as “love”, “great” and “pretty”. The variation in sentence length was greater in the Image Commenting dataset, implying that these sentences were less structured and more natural. The conclusion was that Image Commenting data was far more expressive and emotional than COCO.
Figure 1. (a) Word cloud of the top words in the COCO dataset, (b) Word cloud of top words in the Image Commenting dataset.
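The frequency comparison behind those word clouds is straightforward to reproduce on any pair of corpora. Here is a toy sketch with made-up sentences standing in for the real datasets:

```python
# Sketch of the frequency comparison described above, using
# collections.Counter on two toy corpora (not the real datasets).

from collections import Counter

STOPWORDS = frozenset({"the", "a", "is", "i", "it", "on", "so"})

def top_words(comments, n=3):
    counts = Counter(
        w for c in comments for w in c.lower().split() if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(n)]

captions = ["a man sitting on a bench", "a dog sitting on the grass"]
comments = ["i like it", "like the dog so much", "love the colors"]

print(top_words(captions))  # "sitting" dominates the descriptive data
print(top_words(comments))  # "like" dominates the social data
```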
Architecture: The first component is image featurization, where we use ResNet to create vector representations of images. We use this feature representation, along with information from the Microsoft Vision API (face recognition, celebrity recognition etc.), to extract a candidate set of comments from the image-comment index. In the last stage, we use a Deep Structured Semantic Model (DSSM) trained on our dataset to rank the candidate comments.
Figure 2. Architecture of the Image Commenting model
Figure 3. Examples of Image Commenting
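As a stand-in for the trained DSSM ranker in the final stage, the ranking step can be illustrated with cosine similarity over embedding vectors. The vectors and comments below are made up for illustration; a real DSSM learns these representations from data.

```python
# Toy stand-in for the DSSM ranking stage: score candidate comments by
# cosine similarity between an image embedding and comment embeddings.
# All vectors here are invented for illustration.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def rank_comments(image_vec, candidates):
    """candidates: list of (comment, embedding) pairs; best match first."""
    return sorted(candidates, key=lambda c: cosine(image_vec, c[1]), reverse=True)

image_vec = [0.9, 0.1, 0.3]
candidates = [
    ("A dog on a beach.",      [0.2, 0.9, 0.1]),   # factual, poor match
    ("Aww, what a happy pup!", [0.8, 0.2, 0.4]),   # emotive, close match
]
best_comment, _ = rank_comments(image_vec, candidates)[0]
print(best_comment)  # -> Aww, what a happy pup!
```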
Building emotional experiences and engagement with Image Commenting
With Image Commenting, machines can generate comments on images that are not just factual, but also emotive and socially relevant. The key aspect of Image Commenting is that the social language captured in the dataset is critical for making machines converse in a human-like manner.
Integrating the Image Commenting model with commercial conversational systems can enable more engaging applications. Be it visual dialog systems, visual question-answering systems, social chatbots, intelligent personal assistants and other AI-powered assistive tools, it can expand the possibilities of machine engagement with humans.
The ability to respond with an emotional and sentimental understanding and use images to convey meaning, instructions or provide reasoning can enhance the quality of conversations humans have with machines. It can add a new dimension of emotion-driven visual communication to the human-machine relationship and make the experience of using AI more engaging for users.
With Paul the Octopus, Nelly the Elephant and now Achilles the Cat predicting match results, animal intuition has remained a popular way of forecasting game outcomes. However, predicting match results in a dynamic sports field poses an interesting challenge for data scientists and machine learning experts. A wide range of variables could swing a match either way. The direction of the wind, for example, could shift the trajectory of a shot enough to reduce the chances of scoring. Subtle changes in humidity could affect a player’s speed, a striker’s chances of scoring could decline over time, and a team’s formation may influence defensive strategies.
During the 2014 World Cup, Bing users searching for updates on their favorite teams were greeted with a surprising result - predictions on the winning team. This was Microsoft’s first foray into an Artificial Intelligence (AI) powered in-game prediction engine - Bing Predicts. Ever since, Bing has predicted the results of various big-ticket sporting events.
Backed by a machine learning model, Bing Predicts estimates the winning chances of teams in various sporting tournaments. Bing accurately predicted the winners of 6 out of 8 knockout-stage games so far in the ongoing 2018 World Cup, including the victory of France over Argentina. Its prediction of Belgium’s win over England in the group stages was against the odds, but Bing’s pick prevailed.
Today, Bing Predicts can envisage possible outcomes for sports events across the world be it football, cricket, tennis or individual sports like track and field events.
Creating the prediction engine
Predicting the outcome of live competitions is a breakthrough in predictive analysis and machine learning. With many variables, live events like football matches and reality TV shows are rich in data but challenging to predict. Creating a model for accurately forecasting these dynamic events was an interesting experiment for our researchers. Here’s how the model was created and improved over the years.
Every sport is unique and has different parameters. However, all competitions can be divided into two broad groups: individual and team-based. Individual sports like sprints and motor races rank every contestant, whereas head-to-head competitions like tennis and football matches are zero-sum games where one side must come out on top.
Our objective with the prediction model was to analyze extensive historical data to predict the winners in team-based matches and to rank the winners in individual sports. For each sport we followed four steps to create a unique and accurate prediction model.
Step 1: Gathering data
Every variable has an impact on the outcome of each match. To make the model as accurate as possible, we gather data on the minutest aspects of historical matches. Data on the structure of the team, age of players, performance records, strength of the schedule, margin of past victories, tendency for home-field advantage, weather conditions, and the texture of the playing surface are all considered for every match.
Step 2: Creating features
Features add further dimensions to the raw data. We apply comparative features, such as how well a certain team has performed against another team in the past. Aggregate features, like the number of games a team has played, help determine experience. Unique features, like the overtaking friendliness of a race circuit, are also considered. Official rankings of players and teams form a separate feature. The final step is to give each feature a weight based on the PageRank algorithm. The data, enriched with game-specific features, enables the model to build accuracy.
To predict the outcome of any match, we gather data on each player from both teams. This could include the number of matches played, the goals scored, the goal attempts blocked, the average speed of sprints, and the general position of players. This data is overlaid with features like the percentage of wins one team has experienced against the other in the past, adjusted for the players who have never faced each other before, and weighted by the official ranks of each team published by the official member association. The ranks establish the performance of each team over the course of the tournament in terms of recent competency and fitness. Ranks are normalized for time decay and comparative performances in different tournaments for accurate results.
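The comparative, aggregate and ranking features described above can be sketched in a few lines. This is a minimal illustration with hypothetical feature names, not the actual Bing Predicts feature set:

```python
from collections import defaultdict

def build_features(history, team_a, team_b, rank):
    """Derive illustrative features for a match between team_a and team_b.

    `history` is a list of (winner, loser) tuples from past matches;
    `rank` maps a team to its official ranking (lower = stronger).
    """
    played = defaultdict(int)
    h2h_wins = 0
    h2h_games = 0
    for winner, loser in history:
        played[winner] += 1
        played[loser] += 1
        if {winner, loser} == {team_a, team_b}:
            h2h_games += 1
            if winner == team_a:
                h2h_wins += 1
    return {
        # comparative feature: team_a's past win rate against team_b
        "h2h_win_rate": h2h_wins / h2h_games if h2h_games else 0.5,
        # aggregate feature: a crude experience proxy
        "experience_gap": played[team_a] - played[team_b],
        # official rankings as a separate feature
        "rank_gap": rank[team_b] - rank[team_a],
    }

feats = build_features(
    [("FRA", "ARG"), ("FRA", "BEL"), ("BEL", "FRA")],
    "FRA", "BEL", rank={"FRA": 7, "BEL": 3})
```

Teams that have never met default to a neutral 0.5 head-to-head rate, mirroring the adjustment mentioned above for players who have never faced each other.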
Step 3: Training the machine
We train the model with feature-enriched data to classify teams and individual players. The classification can be 'winner', 'loser' or 'draw' in a team-based sport, or 'top three' or 'bottom three' in an individual ranked sport. The model analyzes all the data to determine the margin of victory for each match. The choice of model depends on the sport being predicted. The features and data are analyzed multiple times to arrive at a series of different outcomes, and the model then creates a prediction based on the aggregate of these outputs.
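The "aggregate of these outputs" idea can be sketched as a majority vote over several simple classifiers. The stub models and feature names below are illustrative stand-ins, not Bing's actual learners:

```python
from collections import Counter

def aggregate_prediction(features, models):
    """Run several classifiers on the same feature vector and return
    the majority outcome. Each model maps features -> 'win' | 'loss' | 'draw'."""
    votes = Counter(model(features) for model in models)
    outcome, _ = votes.most_common(1)[0]
    return outcome

# Toy single-feature learners (hypothetical feature names).
models = [
    lambda f: "win" if f["h2h_win_rate"] > 0.5 else "loss",
    lambda f: "win" if f["rank_gap"] > 0 else "loss",
    lambda f: "win" if f["experience_gap"] >= 0 else "draw",
]

pick = aggregate_prediction(
    {"h2h_win_rate": 0.7, "rank_gap": -1, "experience_gap": 2}, models)
```

In practice the real engine would aggregate outputs from repeated analyses of a trained statistical model rather than hand-written rules, but the voting structure is the same.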
Step 4: Checking for accuracy
Measuring accuracy is an essential part of the predictive model. For team-based sports, the simple ratio of correct predictions to total predictions is sufficient. For individual sports, the accuracy measurement is more complex: we apply Normalized Discounted Cumulative Gain (NDCG), which indicates the model's predictive capacity at each rank. For motor racing, we focus on the top three ranks, since the racers who end up on the podium matter most.
Cues from ‘wisdom of the crowds’
Statistical data is not enough to accurately predict live games. Bing Predicts applies this model with two more layers of data - anonymized crowd sentiments and real-time updates. Anonymized web activity helps apply the ‘wisdom of the crowds’ to this model while real-time updates on player injuries and unforeseen suspensions help augment the prediction engine.
During our research, we have noted that web activity is less biased than polling, which makes web-trend analysis more insightful. In fact, listening to the 'wisdom of the crowd' can improve the accuracy of Bing Predictions by 5 percent, a significant gain.
The algorithm derives sentiments and real-time data from web activity and public discussion across social media. This part of the model typically picks up fans' views and the latest news on controversies, injuries, line-up changes, suspensions and more. For instance, Germany and Brazil entered the semifinals of the 2014 FIFA World Cup as favorites, ranked second and third in the world respectively. However, just before the crucial match, Brazil lost two key players: a star striker to injury, and its captain and defender to an accumulation of yellow cards. This was the key signal that prompted the model to pick Germany as the winner in its pregame predictions.
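Conceptually, the engine blends the statistical model's output with crowd sentiment and then adjusts for late-breaking news. The weights and helper below are illustrative assumptions, not the engine's actual parameters:

```python
def blended_win_probability(model_prob, crowd_prob,
                            crowd_weight=0.2, injury_penalty=0.0):
    """Blend the statistical model's win probability with anonymized
    crowd sentiment, then reduce the result for real-time signals such
    as injuries or suspensions. All weights are hypothetical.
    """
    p = (1 - crowd_weight) * model_prob + crowd_weight * crowd_prob
    p = max(0.0, p - injury_penalty)
    return min(1.0, p)

# Crowd sentiment nudges the statistical estimate upward...
base = blended_win_probability(0.6, 0.8, crowd_weight=0.25)
# ...while a key player's injury pulls the blended figure back down.
dented = blended_win_probability(0.6, 0.8, crowd_weight=0.25,
                                 injury_penalty=0.1)
```

In the 2014 semifinal example above, the injury and suspension signals would act like a large `injury_penalty` on Brazil's side of the prediction.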
Over the years, the engine has been modified and tweaked to increase precision and provide deeper insights into upcoming events. These insights come directly from the statistical features the model uses and from crowd sentiments. For example, predictions for the knockout stages of the current FIFA World Cup show the strengths and weaknesses of each team and what it would take for any of them to win.
Further, facts from the teams' past matchups reveal interesting insights.
Foretelling the future
Sports predictions are different from forecasts of events decided by popularity and voting. Indeed, sports are fun precisely because of their unpredictability: the actions of a handful of players can change the outcome of games and tournaments. That uncertainty makes forecasting sporting events both more challenging and more fun. Today, our engine gives users a predicted winner, a confidence percentage, and the reasons behind each pick across the most popular sporting events.
You can view the experience at https://www.bing.com/search?q=fifa. Bing users can follow our model’s prediction closely to see whether the engine can surpass its own record.
Humans are innately visual. We respond to and process data presented in charts and graphs far better than data buried in spreadsheets or reports. Tools like Microsoft Visio and Visio Online help people visualize and leverage the wealth of data they work with: they can simplify and convey complex concepts in a universal manner, and create, store, share and collaborate on visuals and diagrams in real time.
Visio has helped numerous users create diagrams, flowcharts and blueprints for their data with ease. Over the years, the platform has evolved to keep up with the demands of the digital age. It is now linked to the cloud and offers modern visual styles.
However, until recently, people with visual impairments or physical disabilities could not use the tool to its full benefit. To help more people leverage Visio's features and create visuals despite limited dexterity, low vision or other disabilities, our team added new accessibility features to the platform.
Here’s how we made Visio more accessible:
Enhancing Visio accessibility
Making a digital tool for data visualizations more accessible presented a unique set of challenges. Unlike a text or image editor, Visio works with diagrams, titles, and structures. A flowchart or Venn diagram carries more layers of information than a simple spreadsheet or text document. Reading and describing these diagrams to a user with visual impairment (VI) or other disabilities required innovative means of reading, creating, editing and sharing.
The switch from Microsoft Active Accessibility (MSAA) to Microsoft UI Automation (UIA) enabled better screen reader support. UIA is the successor framework underpinning our assistive technology products and automated testing tools, and it offers a number of improvements to the accessibility features of all our products, including Visio.
UIA offers a filtered view of the tree structure of the user interface. In this tree structure, the desktop is the root, the applications are immediate children, and the various UI elements are descendants of those children. This new approach along with other innovations has helped us make Visio more accessible.
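The filtered tree view can be sketched in miniature: a depth-first walk over UI elements, keeping only the nodes a screen reader cares about. The classes and control-type names below are simplified stand-ins for the real UIA object model:

```python
class UIElement:
    """Minimal stand-in for a UIA element: the desktop is the root,
    applications are its immediate children, controls are descendants."""
    def __init__(self, name, control_type, children=None):
        self.name = name
        self.control_type = control_type
        self.children = children or []

def walk(element, predicate, depth=0):
    """Depth-first walk yielding a filtered view of the UI tree,
    similar to how assistive tools consume UIA's content view."""
    if predicate(element):
        yield depth, element
    for child in element.children:
        yield from walk(child, predicate, depth + 1)

desktop = UIElement("Desktop", "Pane", [
    UIElement("Visio", "Window", [
        UIElement("Start", "Shape"),
        UIElement("Decision", "Shape"),
    ]),
])

# A screen reader asking only for diagram shapes sees a filtered tree.
shapes = [e.name for _, e in
          walk(desktop, lambda e: e.control_type == "Shape")]
```

Filtering at traversal time is what lets assistive tools skip structural chrome and announce only meaningful content.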
Creation and consumption of accessible diagrams
Individual shapes can be identified and described based on localized control types. A screen reader can read Visio diagrams, announcing specific formatting details such as the size, position and color of each shape. It also helps users understand the connections between shapes by reading out the start and end points of connectors. The style set can now be specified so that users have a sense of direction, as well as starting and ending points, while connecting shapes within flowcharts. Recent upgrades keep these documents accessible when they are exported to PDF; within an exported PDF, the new features can detect linear structures within a tree diagram and read the various elements aloud.
To make the diagrams seem as natural as possible for a user, the new engine reads the relationships between various texts, elements and shapes in the diagram. The engine uses this information to create a traversal flow of descriptions for shapes and texts so that the user can easily follow the pattern and understand the flowchart or diagram. Communicating the relationship between shapes to the accessibility tools makes the accessibility features feel more natural for the user. Similarly, the formats have been upgraded to ensure the accessibility features can be changed without compromising performance.
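One simple way to picture the traversal flow is as a topological ordering of shapes along connector direction, so the narration follows the flowchart from start to end. This is a sketch of the idea, not Visio's actual engine:

```python
from collections import defaultdict, deque

def reading_order(shapes, connectors):
    """Order flowchart shapes by following connector direction
    (Kahn's topological sort), so a screen reader can narrate the
    diagram start-to-end. `connectors` is a list of
    (from_shape, to_shape) pairs.
    """
    incoming = {s: 0 for s in shapes}
    out = defaultdict(list)
    for src, dst in connectors:
        out[src].append(dst)
        incoming[dst] += 1
    # Begin with shapes no connector points at (the diagram's entry points).
    queue = deque(s for s in shapes if incoming[s] == 0)
    order = []
    while queue:
        shape = queue.popleft()
        order.append(shape)
        for nxt in out[shape]:
            incoming[nxt] -= 1
            if incoming[nxt] == 0:
                queue.append(nxt)
    return order

flow = reading_order(["Start", "Check", "End"],
                     [("Start", "Check"), ("Check", "End")])
```

Narrating shapes in this order, rather than in drawing order, is what makes the diagram "feel natural" to a listener.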
Accessibility features in Visio
Visio’s accessibility features have been designed to be as effortless and natural as possible. Users can easily create new content on Visio with the help of a screen reader such as Job Access With Speech (JAWS) or Windows Narrator. With UIA, the relationships between shapes and diagrams are automatically established and ordered so that a user can easily navigate the diagram. The latest feature, Data Visualizer, helps users transform data from Excel spreadsheets into Visio diagrams. Users can add Alt-Text in the Excel table to make the output diagrams accessible.
With the help of Alt-Text and a defined navigation order, the user can now convert documents to a PDF format and easily share them with everyone.
Enabling a more inclusive experience for all
Making Visio accessible helps us bring this powerful tool to more users, especially students with disabilities. Using Visio with these tools is more natural and seamless than ever before. The diagrams can now be interpreted by screen readers, making this visual platform more accessible for everyone. The platform now adapts to specific user disabilities and preferences to offer a more inclusive environment to create, share and consume accessible diagrams.
By adopting the universal standard for user interfaces and adding unique features for accessibility, our team has managed to open this visual platform to a wider user base than ever before. These modifications and new features are in line with our mission to empower every person and every organization on the planet to achieve more.