Far Reaching Research (FRR) Project
IBM Research

See, Hear, Do: Language and Robots

Jonathan Connell, Exploratory Computer Vision Group
Etienne Marcheret, Speech Algorithms & Engines Group
Sharath Pankanti (ECVG)
Josef Vopicka (Speech)

© 2002 IBM Corporation

Challenge = Multi-modal instructional dialogs

§ Use speech, language, and vision to learn objects & actions
    Innate perception abilities (objects / properties)
    Innate action capabilities (navigation / grasping)
    Easily acquire terms not knowable a priori
§ Example dialog (command following, verb learning, noun learning, advice taking):
    Human: Round up my mug.
    Robot: I don’t know how to “round up” your mug.
    Human: Walk around the house and look for it. When you find it bring it back to me.
    Robot: I don’t know what your “mug” looks like.
    Human: It is like this <shows another mug> but sort of orange-ish.
    Robot: OK … I could not find your mug.
    Human: Try looking on the table in the living room.
    Robot: OK … Here it is!
§ Language Learning & Understanding is an AAAI Grand Challenge
    http://www.aaai.org/aitopics/pmwiki.php/AITopics/GrandChallenges#language

© 2005 IBM Corporation

Eldercare as an application

§ Example tasks:
    Pick up dropped phone
    Get blanket from another room
    Bring me the book I was reading yesterday
§ Large potential market
    Many affluent societies have a demographic imbalance (Japan, EU, US)
    Institutional care can be very expensive (to person, insurance, state)
§ A little help can go a long way
    Can be supplied immediately (no waiting list for admission)
    Allows person to stay at home longer (generally easier & less expensive)
    Boosts independence and feeling of control (psychological advantage)
§ Note: We are not attempting to address the whole problem. Out of scope:
    Aggressive production cost containment
    Robust self-recharging and stairs traversal
    Bathing and bathroom care, patient transfer, cooking
    OSHA, ADA, FCC, UL or CE certification

State of the art

§ Indoor navigation
    Minerva from CMU, Jose from Univ. British Columbia
    No object perception
    No manipulation capability
§ Perception & manipulation
    Herb from CMU / Intel (Kanade), PR2 from Willow Garage
    Off-line object model generation
    No natural language interface
§ Language learning
    Ripley from MIT (Deb Roy), HAM from KTH in Sweden
    Either fetch or carry
    No procedural learning
§ Dialog and speech
    Honda system from IBM, call center handling from IBM
    No physical presence or action
    No visual perception of objects

Business Model

[Diagram labels: OEM → (buy hardware) → IBM, $70 B / year → (add software and services) → Third Party → customers]

Costs & revenue potential

§ OEM sales price for hardware: $6000
    Electromechanical parts: $1300
    Onboard computer: $500
    Assembly (15 hrs x $80 / hr): $1200
    + 30% sales & distribution + 20% profit: $3000
§ Value-added wholesale price (w/ software): $15,000
    10% continued R&D: $1500
    30% sales & distribution: $4500
    20% profit: $3000
    Price = less than a new car
§ Total cost of ownership: $8000 / yr
    Lifetime = 3 years: $5000 / yr
    Service (15 hrs / quarter x $50 / hr x 4 quarters): $3000 / yr
    ➢ Effective wage (40 hrs / wk x 50 wks / yr = 2000 hrs / yr): $4 / hr
§ Eldercare market in US (x 3 if EU and AP also): $24 B / yr
    Resell robot + value-added software + field service
    Total US population: 300 million
    Ages 75-85 and suitable (ability level, desire, finances; 10%): 3 million
    ➢ Manufacturing business ($2000 / robot-yr): $6 B / yr
    ➢ Services business ($3000 / robot-yr): $9 B / yr
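
The arithmetic behind the cost slide can be checked with a short script. This is only a sketch: the variable names are illustrative, and it assumes the percentages are fractions of the selling price at each stage (which is the reading that reproduces the slide's totals). Integer arithmetic is used so the round numbers come out exactly.

```python
# Sketch of the slide's cost arithmetic. Names are hypothetical; figures are from the slide.

# OEM hardware: build cost plus margins expressed as a share of the sale price.
parts = 1300
computer = 500
assembly = 15 * 80                                    # 15 hrs x $80 / hr
build_cost = parts + computer + assembly              # $3000
# 30% sales & distribution + 20% profit leave 50% of the price for build cost.
oem_price = build_cost * 100 // (100 - 30 - 20)       # $6000

# Value-added wholesale price: hardware is what remains after 10% + 30% + 20%.
wholesale_price = oem_price * 100 // (100 - 10 - 30 - 20)   # $15,000
rd = wholesale_price * 10 // 100                      # $1500
sales = wholesale_price * 30 // 100                   # $4500
profit = wholesale_price * 20 // 100                  # $3000

# Total cost of ownership per year over a 3-year lifetime.
purchase_per_year = wholesale_price // 3              # $5000 / yr
service_per_year = 15 * 50 * 4                        # $3000 / yr
tco_per_year = purchase_per_year + service_per_year   # $8000 / yr
effective_wage = tco_per_year / (40 * 50)             # $ per hour over 2000 hrs / yr

# US market size for 3 million robots.
robots = 3_000_000
market = robots * tco_per_year                        # $24 B / yr
manufacturing = robots * 2000                         # $6 B / yr
services = robots * 3000                              # $9 B / yr

print(oem_price, wholesale_price, tco_per_year, effective_wage)
print(market, manufacturing, services)
```

Under these assumptions every total on the slide is reproduced, which supports reading the margins as shares of price rather than markups on cost.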

Sample business case

§ Home eldercare now (employer costs): $25,000 / yr
    1 aide from 8 am to 6 pm = 10 hrs
    50 wks x 5 days / wk x 10 hrs / day = 2500 hrs / yr
    Federal min. wage = $7.25 / hr
    +38% overhead (FICA + 401K + medical) = $10 / hr
§ Aide’s activities:
    Help with clothes, hygiene, meals
    Odd tasks such as fetching objects
    Sitting around watching TV
§ Alternative: Half-time aide + robot: $20,500 / yr
    Human still helps with clothes, hygiene, meals
    Robot potentially available after hours and on weekends
    No problem with robot Training, Turnover, and Trust (stealing)
§ Value proposition (to client): 30% more hours @ 10% less cost
    Split savings with customer ($50,000 → $45,000 per client)
    Human 5 hrs + robot 8 hrs = 13 hrs / day during week
    10% less revenue but 22% more profit (= $6.6 B / yr extra profit if 100% market share)
    Bill at $20,000 - $3000 service = $17,000 / yr revenue
    10.6 months payback on $15,000 purchase
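
The headline numbers in this business case follow from a few lines of arithmetic, sketched below. Variable names are illustrative, the $8000 / yr robot cost is taken from the total-cost-of-ownership figure in the cost slide, and the loaded wage uses the slide's rounded $10 / hr. The 22% profit and $6.6 B figures are not reproduced here because the slide does not give enough detail to derive them.

```python
# Sketch of the business-case arithmetic. Names are hypothetical; figures are from the slides.

# Current arrangement: one aide, 10 hrs / day on weekdays.
aide_hours = 50 * 5 * 10                     # 2500 hrs / yr
loaded_wage_exact = 7.25 * 1.38              # about $10.005; the slide rounds to $10 / hr
loaded_wage = 10
aide_cost = aide_hours * loaded_wage         # $25,000 / yr

# Alternative: half-time aide plus robot at its yearly cost of ownership.
robot_tco = 8000
hybrid_cost = aide_cost // 2 + robot_tco     # $20,500 / yr

# Client view: more weekday coverage at a lower price.
hours_now, hours_hybrid = 10, 5 + 8          # 13 hrs / day with the robot
more_hours_pct = (hours_hybrid - hours_now) * 100 // hours_now   # 30
price_now, price_hybrid = 50_000, 45_000
less_cost_pct = (price_now - price_hybrid) * 100 // price_now    # 10

# Payback on the robot purchase from net yearly robot revenue.
net_revenue = 20_000 - 3_000                 # $17,000 / yr
payback_months = 15_000 / net_revenue * 12   # roughly 10.6 months

print(aide_cost, hybrid_cost, more_hours_pct, less_cost_pct)
print(round(payback_months, 1))
```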

What’s different and important

§ Speech-driven interface
    No headset required (far field), can learn new nouns and verbs
§ Multi-modal dialog
    Responds to gestures, exploits synergies between modalities
§ Manipulation as well as mobility
    Not just a walking telephone, can do useful physical work also
§ One-shot learning
    No turntable scanning, not 100’s of examples, no trial-and-error experiments
§ Cost containment
    Vision instead of special-purpose sensors and precise mechanicals