Humans typically look at a scene and notice objects in terms of their relationships with each other: a laptop on a desk, next to a phone, both in front of a computer monitor. Most deep learning models struggle to grasp such inter-relationships, and because they cannot untangle them, they cannot see the world the way humans do. This is a major limitation, because it restricts a robot’s ability to carry out specific instructions such as fetching the spatula that sits to the right of the stove and just below the cupboard.
That problem may soon be addressed, as a research team has developed a new model that understands these inter-relationships. The model works by representing individual relationships one at a time and then merging the single representations to capture the overall scene. This lets it generate far more accurate images from text descriptions, even when the objects in the scene stand in several different relationships to one another. The study suggests that the new machine learning model could help boost the Artificial Intelligence (AI) Market by enabling robots to comprehend interactions in the world the way humans do.
Further, the framework can infer complex relationships in a scene from text descriptions alone. Given a description, for example, the model can recognize a wooden table situated to the left of a blue stool, or a red couch placed to the right of a blue chair.
The researchers accomplished this by using a machine-learning technique known as energy-based models to represent the individual object relationships in a scene description. This approach lets them encode each relational description with its own energy-based model and then combine those models in a way that infers all of the objects and their relationships together.
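To make the composition idea concrete, here is a minimal PyTorch sketch of how per-relation energy models can be combined: each relation gets its own energy function, the energies are summed, and a candidate image is refined by gradient steps until it has low energy under every relation at once. The class and function names, network sizes, and sampling constants are illustrative assumptions, not the authors’ implementation.

```python
import torch
import torch.nn as nn


class RelationEnergyModel(nn.Module):
    """Scores how well an image matches one relational description.

    Placeholder architecture: the authors' networks differ. This only
    illustrates the interface of an energy model E(image, relation).
    """

    def __init__(self, relation_dim: int = 64, image_channels: int = 3):
        super().__init__()
        self.image_encoder = nn.Sequential(
            nn.Conv2d(image_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.energy_head = nn.Linear(64 + relation_dim, 1)

    def forward(self, image: torch.Tensor, relation: torch.Tensor) -> torch.Tensor:
        features = self.image_encoder(image)
        return self.energy_head(torch.cat([features, relation], dim=-1))


def generate_scene(models, relation_embeddings, steps=60, step_size=10.0, noise=0.005):
    """Compose single-relation energy models by summing their energies, then
    refine a random image with Langevin-style gradient steps so the result
    has low energy under every relation at once."""
    image = torch.rand(1, 3, 64, 64, requires_grad=True)
    for _ in range(steps):
        total_energy = sum(
            model(image, rel).sum()
            for model, rel in zip(models, relation_embeddings)
        )
        (grad,) = torch.autograd.grad(total_energy, image)
        with torch.no_grad():
            image -= step_size * grad                  # descend the summed energy
            image += noise * torch.randn_like(image)   # small Langevin noise term
            image.clamp_(0.0, 1.0)
    return image.detach()


# Two relations, e.g. "laptop left of phone" and "phone in front of monitor",
# each stood in for by a random 64-d embedding purely for illustration.
models = [RelationEnergyModel(), RelationEnergyModel()]
relations = [torch.randn(1, 64), torch.randn(1, 64)]
scene = generate_scene(models, relations)
print(scene.shape)  # torch.Size([1, 3, 64, 64])
```

Because the total energy is just a sum, adding or removing a relation only adds or removes one term, which is what makes this kind of composition flexible.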
Breaking each description into shorter pieces, one per relationship, allows the system to recombine those pieces in new arrangements, so it can adapt to scene descriptions it has never encountered before, something earlier models could not do.
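The following toy Python helper, a hypothetical illustration rather than the paper’s actual text processing, shows what splitting a description into per-relation pieces could look like: each clause becomes a (subject, relation, object) triple that a single-relation model can then handle on its own.

```python
import re

RELATION_WORDS = "to the left of|to the right of|in front of|behind|above|below"


def split_into_relations(description):
    """Toy helper (not from the paper): break a scene description into one
    (subject, relation, object) triple per clause, so every piece can be
    handled by its own single-relation model and freely recombined."""
    pattern = re.compile(
        r"(?:an?\s+)?(?P<subj>[\w\s]+?)\s+"
        r"(?P<rel>" + RELATION_WORDS + r")\s+"
        r"(?:an?\s+)?(?P<obj>[\w\s]+)"
    )
    triples = []
    for clause in re.split(r",|\band\b", description):
        match = pattern.search(clause.strip())
        if match:
            triples.append((match.group("subj").strip(),
                            match.group("rel"),
                            match.group("obj").strip()))
    return triples


print(split_into_relations(
    "a wooden table to the left of a blue stool and a red couch to the right of a blue chair"
))
# [('wooden table', 'to the left of', 'blue stool'),
#  ('red couch', 'to the right of', 'blue chair')]
```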
The model has numerous potential applications. It could be used wherever industrial robots must perform sophisticated multi-step tasks, such as stacking boxes in a warehouse or assembling appliances. The framework also brings us one step closer to machines that learn from and interact with their environment much as humans do.