MUM: A new AI milestone for understanding information

257 points by chris_f 4 years ago | 208 comments
  • floatrock 4 years ago
    A not-so-subtle reading shows Google is doubling down on ecommerce applications here:

    > It could also understand that, in the context of hiking, to “prepare” could include things like fitness training as well as finding the right gear.

    > fall is the rainy season on Mt. Fuji so you might need a waterproof jacket.

    > MUM could also surface helpful subtopics for deeper exploration — like the top-rated gear or best training exercises

    > you might see results like where to enjoy the best views of the mountain, onsen in the area and popular souvenir shops

    Or, my favorite line:

    > MUM would understand the image and connect it with your question to let you know your boots would work just fine. It could then point you to a blog with a list of recommended gear.

    (in other words: "Thanks for showing you're interested in hiking gear. Here's a lot of hiking gear you can buy.")

    • rexreed 4 years ago
      There's an even bigger picture than possibly monetizing ecommerce revenue (through... ads?). The biggest impact is that they get to use all the content generated on the Internet to create these search "results" that synthesize information from multiple sources without ever having to share traffic or ad revenue with those content sources. Clever.
      • judge2020 4 years ago
        This really is a section that needs regulation. You basically have to use and allow Google to crawl your site if you want a website findable by 95%+ of Americans, so websites really should be able to tell google how they're allowed to use the scraped data instead of just 'for anything'. Maybe a meta tag would work well.
        • d110af5ccf 4 years ago
          > websites really should be able to tell google how they're allowed to use the scraped data

          Isn't it a bit more complicated than that though? While you certainly aren't entitled to republish things, you (ie anyone and everyone) have traditionally been free to consume public material in whatever way you see fit. The precedent from the recent LinkedIn case regarding scraping supports this.

          Also you focus on Google, but anyone with sufficient resources can scrape the public web (anti-bot cat and mouse games notwithstanding I suppose).

          (Paywalled sites that allow Google to scrape them for search indexing purposes are an interesting edge case though.)

          • cma 4 years ago
            Should human experts, like someone preparing a blog post on gear, also have to pay all the blogs and books they read in their research?
            • Mauricebranagh 4 years ago
              And how else would a search engine work? I am not sure what your proposed solution would look like.
            • bjterry 4 years ago
              In the current world, information wants to be free. In the AI-powered future, knowledge wants to be free.
              • danielheath 4 years ago
                Those who created information had a problem with that past. Those who create knowledge will have a problem with that future.
              • mhoad 4 years ago
                I don't know if this is actually true or not but I suspect that a big part of their thinking is that "we are just presenting 'facts' and facts are not subject to copyright laws".
                • jonnycomputer 4 years ago
                  Forgive the analogy but that sounds like a parasitic relationship, and one that might kill off, or at least impoverish, its host. Even if Google isn't doing that, the potential exists. The counter is paywall, I suppose.
                • colordrops 4 years ago
                  Another not-so-subtle reading shows google doubling down on being "responsible" which has a lot of collateral damage when they block or de-emphasize legitimate results that don't fit their own goals.
                  • sangnoir 4 years ago
                    It rings a little hollow when they fired members of/disbanded their nominally independent internal ML Ethics unit after a member published a paper raising some flags on the kind of models Google is betting its future on.
                    • dragonwriter 4 years ago
                      “Responsible” AI was what Google invented after the Ethical AI purge.

                      The appearance is that it’s about AI being responsible for advancing corporate image and interests.

                  • glenstein 4 years ago
                    I don't think the language you've quoted was explicitly intended that way. But I think you're onto something. I think high-context answers open up all kinds of new contextual surfaces where ads can be placed, products + product categories suggested.
                    • derefr 4 years ago
                      I don’t know if it’s “e-commerce” specifically, or just a more general fact that Google own a search engine, and want to surface URLs from their index as answers to questions, when appropriate. And, when you think about it, why would you be linking to a page — rather than giving a straightforward answer — unless you’re linking to a product page / review / other page that offers you a direct means of solving a problem that goes beyond a conversational answer?
                      • floatrock 4 years ago
                        > unless you’re linking to a product page / review / other page that offers you a direct means of solving a problem

                        Embedded in this answer seems to be the mindset that only buying things will solve problems.

                        Don't get me wrong -- I'm not a consumerist luddite, I use my credit card points like any good and proper citizen -- but when your mindset is "all problems can be solved by buying more shit", well, that's a pretty lonely existence.

                        Google's gotta make money, and helping people buy useful shit is a fine way of doing it, but just don't fall into the mindset trap that every solution in life is just a Google Pay away.

                        • derefr 4 years ago
                          No, let me rephrase: often, problems can be solved with words. In those cases, the conversational agent wouldn't link to anything. It would just solve your problem.

                          But if a problem being solved necessitates linking to something, then what kind of problem is that likely to be? Usually one where you need to stare at something, mull over a bunch of details, and make a decision. What kinds of webpages are those? Usually — for public clients — those are product pages.

                          (Another potential use-case is that a conversational agent could help people configure software/services by deep-linking to configuration screens — but that's not really a thing Google Search could integrate with.)

                      • zepto 4 years ago
                        The Google of the future is a conversation with a salesman.
                        • tachyonbeam 4 years ago
                          Or maybe it's a personal assistant that lives in your phone and asks you how you're doing every day, acts like a friend, inquires about your mental health and well-being... And then subtly nudges you in the direction of buying X,Y,Z thing or service to help you fill that existential void in your life.
                          • zepto 4 years ago
                            A digital cult leader that can talk to everyone in the world!
                      • rexreed 4 years ago
                        Search quality at Google has been decaying over the past decade. Accuracy and quality of search results is compromised to optimize advertising revenue, penalize competitors or neutralize threats, and cater to the various needs of political or regulatory authorities.

                        Google's search was at its peak in 2008 when advertising hadn't fully compromised search quality. Google is an advertising business that supports its otherwise money losing properties. Why will things change in the future because you can synthesize data from multiple sources only to compromise that quality with the realities of Google's business model?

                        • mxcrossr 4 years ago
                          > Search quality at Google has been decaying over the past decade.

                          Is there any empirical evidence to back this up? If we’re talking anecdote, I swear as soon as google started labeling ads more clearly people complained more about ads. And if google really is getting worse, I would expect that I would get frustrated with DuckDuckGo not getting the job done less often.

                          I do share your concerns though. Just look at YouTube as an example. You search for something, and half way down the page are completely unrelated videos that you watched before. This is because YouTube just wants you to click, they don’t care about you finding what you were after.

                          • ashleyn 4 years ago
                            The one example I usually give people is the one that led me to the realisation myself.

                            Try searching "how valve index works" or "how valve index controllers work". My interpretation of "how it works" is "technical information on how an item operates". Google will interpret this instead as "how well it performs its intended functions" and flood me with both links to purchase the Valve Index as well as endless reviews. Results on Google are not tailored toward retrieval of factual information anymore. They're tailored to ordinary, garden-variety consumers, and obviously designed to sell you a Valve Index.

                            To this day I still have not found really good information on how the controllers in the Valve Index actually work. All I get are pushes and nudges into getting me to buy something.

                            • moultano 4 years ago
                              Those are good examples! I'll pass them along to debug. I think what's happening is that the wording is ambiguous enough that it's colliding with concepts like "how well does valve index work." If you search for "how does valve index tracking work" then you get results like this, which is more in line with what you're looking for. https://gizmodo.com/this-is-how-valve-s-amazing-lighthouse-t...
                              • numpad0 4 years ago
                                I think what happened was they pivoted from _document search_ to an interactive oracle app. I would have used “index controller principles” to get the documents describing it, which no longer works. And I think what you want is the document search back.

                                And these days they throw a lot of machine translated ripoff sites as well as some malvertising dummy type sites. It’s really something.

                                • rrdharan 4 years ago
                                  > I still have not found really good information on how the controllers in the Valve Index actually work.

                                  Isn’t the Occam’s razor explanation here just that that information is not actually available on the web - not that Google is hiding it from you?

                                • ergot_vacation 4 years ago
                                  I'll give you one: Google image search is so insanely hobbled by the copyright squad (and possibly right to be forgotten, etc) that it's essentially worthless now. Reverse image search used to be a valuable tool. Now it just spits out generic garbage, even for images that clearly have a wide presence on the net. These days if I need to try to hunt for something, I just pull up Yandex and get the results Google used to give five or ten years ago (better even, since there's a bunch of neat added features like object recognition and automatic OCR).
                                  • mdoms 4 years ago
                                    I have an example from just yesterday. I am new(ish) to the rails ecosystem and spotted a `.ruby-version` file in the root of the repository. I didn't know what it was so I googled `.ruby-version`. The results were less than helpful because Google interpreted that as a search for the term `ruby version`. Fine, whatever, I will just fall back on double-quoting the whole thing, like `".ruby-version"`. A couple of years ago this would have worked perfectly - I know, because I've been doing it for years. But Google no longer respects this kind of search query, instead it tries to be too clever by half and ends up being worse than useless.
                                    • posixplz 4 years ago
                                      I miss the days when punctuation marks were significant to google searches. And the days when you could use logical operators in searches like +&!

                                      I’m glad I learned POSIX and especially Linux when searches were evaluated more literally. It was simple to locate relevant technical pages.

                                      It’s a shame google doesn’t offer legacy search.

                                    • mmahemoff 4 years ago
                                      I can only give you anecdotal evidence, which is that myself and many others (per social media) are constantly appending "reddit" because the first N results are all e-commerce sites or thinly-veiled promos for them in the form of listicles.

                                      e.g. Search for "camera with wifi" versus "camera with wifi reddit". If you're doing any research, you will find the latter more useful. Now I know some will say many people just want to buy the product and will be satisfied with a direct link to purchase, but the thing is a good search engine will mix in different types of results. What you get here is dozens of virtually identical results with any genuine info - e.g. a recent post on a reputable personal blog or a social media post - completely buried.

                                      Do any other engines do it better? Maybe not. But Google itself certainly used to do it better, if only because it didn't have the majority of the internet trying to game its algo.

                                      • basch 4 years ago
                                        At this point, I basically need to know or find an authoritative source first. PCMag still appears to be a good resource, moreso than Tom's Hardware and Wirecutter at times (I think.) It's sort of the same shit, but they seem to put a little more work into being right. Too many listicles that are "10 best" are really "the first 10 the author saw while searching." When coronavirus started, there's no way anybody writing most of those "review rollups" ordered and tried on any of the masks they assembled into posts. There are fewer and fewer places that seem to be trying things themselves before recommending them.

                                        https://www.pcmag.com/picks/the-best-sony-mirrorless-lenses

                                        Google really needs an authoritative mode that strips out or deduplicates the news cycle and blogosphere. Something that can tell that every post is basically the same thing and turns it into one entry. I want uniqueness and quality. I don't need the same opinion repeated across 10 urls.
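
For illustration only: a minimal Python sketch of one common way to collapse near-identical posts into a single entry, using word shingles and Jaccard similarity. Nothing here reflects how Google or MUM actually works; the function names, the shingle size `k`, and the `threshold` value are all assumptions chosen for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full window: treat the whole text as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dedupe(posts, threshold=0.5, k=3):
    """Keep the first post of each near-duplicate group (hypothetical helper).

    A post is dropped when its shingle set overlaps an already kept post
    by at least `threshold` (an assumed cutoff, not a tuned value).
    """
    kept = []  # list of (text, shingle set) pairs
    for post in posts:
        sig = shingles(post, k)
        if all(jaccard(sig, other) < threshold for _, other in kept):
            kept.append((post, sig))
    return [text for text, _ in kept]
```

This greedy pass compares every post against every kept post, which is fine for ten URLs but would need something like MinHash to scale to a web-sized index.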

                                        A CTRL+F of the MUM page didn't find the word duplicate once.

                                        • nsonha 4 years ago
                                          next generation of search engines should have a config where people can customize their algorithm.

                                          I don't feel that google did worse over the years, more like the commerce part of the internet overtaking the information part.

                                          Actually you make a good point, since google has a shopping tab maybe they should show ads over there only and dedicate the "normal" google to general info

                                          • selfhoster11 4 years ago
                                            Not to mention that most of those listicles are an extremely shallow cross-section of available products. Reddit is far more willing to suggest off-the-wall options like used 5 year old hardware that still performs better than the newest shiny, and uncovers far more options that are slightly off the beaten, consumerist path.
                                            • mmahemoff 4 years ago
                                              And also, a search engine with a greater bias on UX would be more personalised, so it would show those kinds of results to the people who regularly seek them.
                                              • numpad0 4 years ago
                                                It’s funny Google Search still does good at what they were intended for, searching intelligence by keyword to gain understanding.

                                                Google cares to neither confirm nor deny it, but the origin of Google Search is reportedly some CIA/NSA internal program. Imagine there’s a ton of random Soviet documents, and you wanted to know what the codename chikensandwich in Slicebread division might refer to, or which document is referred to the most from other documents regarding the topic. Don’t you think Google Search as you remember it does exactly that?

                                                And this conspiracy theory explains why Search, Maps and Mail and very few other products built by such a laid back disorganized organization work so well and only those work well, that it’s because those are technology dump from NSA and Google is just an elaborate museum shop allowed to capitalize on their heritage.

                                                1: https://qz.com/1145669/googles-true-origin-partly-lies-in-ci...

                                              • mark_l_watson 4 years ago
                                                I came here for a technical discussion of MUM, but your comment just triggered something: my wife and I pay for ad-free YouTube (part of the music bundle). As a paying customer, I just don’t understand why they would annoy me with showing videos that I have already seen. A better UI would be a top level menu option to show history of watched video (and search within already watched material). Then the default page could remove already seen material.

                                                I am a happy paying customer for GCP, Play Books+Movies, etc., but I think they need to step up the quality of their services for paying YouTube customers.

                                                Thanks for your comment.

                                                • techbio 4 years ago
                                                  Not likely to find empirical evidence of search results quality, but I think there might be for an overall lowering of content quality. It is so much cheaper to mass produce unimpressive content than ever before.
                                                  • monkeybutton 4 years ago
                                                    How about image search? In 2008 there weren't product images and shopping campaign ads inline with the rest of the results. Also reverse image search is now being supplanted by Google lens search, which again serves up products and ads based on what it can tag in your photo.
                                                    • marderfarker2 4 years ago
                                                      Huh. Just ublock all those annoyances away. Didn’t know Google search is so ad ridden.
                                                    • bogwog 4 years ago
                                                      > If we’re talking anecdote, I swear as soon as google started labeling ads more clearly people complained more about ads

                                                      When did Google start labeling ads more clearly? See: https://searchengineland.com/figz/wp-content/seloads/2016/07...

                                                      Today the labeling consists of the letters "Ad" in black next to the result.

                                                      source: https://searchengineland.com/search-ad-labeling-history-goog...

                                                      • ipaddr 4 years ago
                                                        Yes the number of ads at the top of the page has increased. The colors have blended ads into content. The number of sites shown is reduced. The amount of content indexed available has been reduced.
                                                        • UncleMeat 4 years ago
                                                          > Is there any empirical evidence to back this up?

                                                          No. This is the same HN post as "Facebook is dying, pretty soon all their users will be gone and they'll collapse".

                                                        • pradn 4 years ago
                                                          How much of it is Google getting worse and how much of it is garbage websites hyperoptimizing for SEO? Practically all news websites are chock full of ads. There's tons of filler websites that just copy/paste text from Wikipedia, etc. Of course, Google could do a better job, but it's codependent evolution.
                                                          • rexreed 4 years ago
                                                            It's Google's prioritization with ads and preferred sites taking priority even over those SEO-optimized sites.

                                                            Google would much prefer to be the sole source of your traffic instead of pushing you to other sites. Google's business is advertising. Why would they want to lose that traffic?

                                                            Check this article about the Google MUM announcement, which basically says the same thing:

                                                            "MUM is part of Google’s long-term shift away from ranked search results and toward the creation of AI algorithms that can answer user questions faster—often without ever clicking a link or leaving Google’s results page. (Think, for example, of the “knowledge panels” that now appear at the top of many search results pages and display an answer from a website so you don’t have to visit the site yourself.) This shift promises to reduce the amount of work it takes to find information through Google. But it’s not clear that this is a problem in need of a solution." [0]

                                                            The Google of today is not the Google of 2008. Google in 2008 was a search engine. Today it's an advertising business that would much prefer you not leave Google properties.

                                                            [0] https://qz.com/2010802/googles-mum-is-making-search-worse-by...

                                                            • shadowgovt 4 years ago
                                                              > This shift promises to reduce the amount of work it takes to find information through Google. But it’s not clear that this is a problem in need of a solution.

                                                              Getting people useful information faster is the problem in need of a solution when you're Google. There isn't a point where that problem is solved; organizing the world's information and making it universally accessible and useful is an unbounded goal.

                                                              • fastball 4 years ago
                                                                Is this just speculation on your part or do you have a source for this claim?
                                                              • hn_throwaway_99 4 years ago
                                                                Garbage websites hyperoptimizing for SEO have existed since the late 90s. I agree with the GP, the issue I have seen over the deterioration of search in the past 5-10 years is specifically a result of their business model:

                                                                1. Any remotely commercial search has an entire first page of ads, organic results are pushed way down.

                                                                2. Google has made the difference between ads and search results as minimal as possible. I long for the days of the early 00s of big yellow boxes.

                                                                3. On many pages the amount of content Google stuffs in at the top before you get to actual search results gets more annoying every year.

                                                                Honestly, I wish I had a button that made Google result pages look like they did 15 years ago.

                                                                • astrange 4 years ago
                                                                  I feel like the main problem I have with Google results is that they never surface anything interesting or old. There are a lot of searches where it returns nothing useful, but if you add "reddit" it becomes useful.

                                                                  Besides that, they haven't fought SEO enough on image search, since Pinterest took it over for years.

                                                                  • mycall 4 years ago
                                                                    > I wish I had a button that made Google result pages look like they did 15 years ago.

                                                                    Or a browser extension.

                                                                  • onion2k 4 years ago
                                                                    > How much of it is Google getting worse and how much of it is garbage websites hyperoptimizing for SEO?

                                                                    Those are the same thing. If garbage websites can game their way up the search listing then Google is failing.

                                                                    This is a simple problem of competition. Google doesn't have any, so they don't need to provide a good product. They can optimize for ad placement and revenue instead of search quality because users perceive that they have no real choice but to use Google. If another search engine manages to get some real market share Google results will get much better again.

                                                                    • nsonha 4 years ago
                                                                      Idk, I feel that google has been doing better at fighting SEO over time. I used to get crap results, but then again I was less experienced and did not use ad blockers
                                                                    • hackinthebochs 4 years ago
                                                                      I would have agreed with you a few weeks ago. I recently switched my browser to the new Edge and I stuck with Microsoft's default Bing search because fuck Google and all that. I had two occasions in two days where Bing's search frustrated me with their results despite many efforts to tweak the query. I switched over to Google and its first result was exactly what I needed. These were cases where the page didn't contain the phrase I was looking for so it had to interpret/translate it to find the correct information and it did a great job.

                                                                      A few years ago I felt that Bing and Google search were basically on par. Google has definitely upped the ante regarding search in the last couple of years. It may just be that it does more interpretation than you've come to expect so you need to retrain yourself how to query it. There are also occasions where verbatim search is required for technical topics. But Google's search quality has shown real improvements.

                                                                      • superasn 4 years ago
                                                                        > Search quality at Google has been decaying over the past decade.

                                                                        This one line is echoed again and again on HN and yet in my experience all its competitors still pale in comparison. I hate Google now as much as the next HNer for its evil shenanigans but their search is still superior and if a browser comes with a default like Bing or Ddg (like ff on linux mint) the first thing I do is change it back to Google since the results are truly awful otherwise.

                                                                        • basch 4 years ago
                                                                          But it is worse than it was a decade ago. Across the board. There's more pollution than ever on the internet, and search engines are doing a worse job of separating the diamonds out.
                                                                          • 1024core 4 years ago
                                                                            Just look at the number and size of ads. Quite often, the entire first page of results is ads, and you have to scroll down to find the organic results.
                                                                          • Der_Einzige 4 years ago
                                                                            Yeah - seems to jibe with my experiences. It's a tough pill to swallow, but BM25 and tf-idf alongside pagerank continue to be superior to dense vector methods for search. Even dense vectors with re-ranking models afterwards don't perform as well. I've been sad to see that models like BERT are becoming more prolific in search as they are a significant portion of why Google's search has gotten worse...
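
For readers unfamiliar with the lexical scoring the comment above refers to, here is a minimal Python sketch of Okapi BM25 over whitespace-tokenized text. It is an illustration under simple assumptions (lowercase whitespace tokenization, the commonly used defaults `k1=1.5` and `b=0.75`, a smoothed IDF), not a description of any production search engine, and the function name is hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document in `docs` for `query`.

    Tokenization is a plain lowercase whitespace split (an assumption
    made to keep the sketch short).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    # Average document length; fall back to 1 if every document is empty.
    avg_len = (sum(len(t) for t in tokenized) / n_docs) or 1

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

A document scores above zero only when it literally contains a query term, which is the exact-match behavior several commenters in this thread say they miss.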
                                                                            • rexreed 4 years ago
                                                                              The key here is that transformer-based "search" isn't actually providing links to the sources of information such as how search works now, but rather synthesizing information as a result of being trained on the corpus of Internet data.

                                                                              In this way, Google gets all the value from Internet properties they don't own without having to push any traffic to those sources. So, they get their cake and eat it too. They create a way to regurgitate information from the vast trove of info on the Internet without ever having to share traffic with those sources by moving traffic from their search engines to those sites, like they do now.

                                                                              They get to sell advertising to those who want to capture eyeballs for search results, without having to share any ad revenue with the content providers that are powering that transformer-based search.

                                                                              Ain't it grand?

                                                                              • wokwokwok 4 years ago
                                                                                This has been coming for some time now, to be fair.

                                                                                Now that it's pretty close to actually being here, the grim reality is that anyone who was expecting the status quo to just march on like always is going to get screwed over, and a new wave of successful businesses will adapt to it and thrive.

                                                                                It's called 'disruption', and it's a bit disappointing to see people here of all places complaining about it.

                                                                                Sure, I get it, it's Google, and if it was some nippy unicorn doing it people would be more enthusiastic, but ML is hard to do right, and having someone who's actually pushing the boundaries of what's possible is, in my opinion, pretty cool.

                                                                                BERT made a huge contribution, and if this eventually flows out to everyone else to use, that's great news.

                                                                                ...and, if google stops sending traffic to some websites, well, too bad. We'll adapt; so will others.

                                                                                The ones that can't will disappear.

                                                                                • visarga 4 years ago
                                                                                  > They get to sell advertising to those who want to capture eyeballs for search results, without having to share any ad revenue with the content providers that are powering that transformer based search. Ain't it grand?

                                                                                  Reminds me of spammers making spun articles.

                                                                                • mrfox321 4 years ago
                                                                                  Source? I am quite intrigued by this anecdote for information retrieval.
                                                                              • aledalgrande 4 years ago
                                                                                Content of the article:

                                                                                - 1000 times more powerful than BERT, but still transformer architecture

                                                                                - trained on 75+ languages, can transfer knowledge between languages

                                                                                - can do text and images (not audio and video yet)

                                                                                - can understand context, go deeper in a topic and generate content

                                                                                Not much apart from their words about how amazing it is. Paper? Demo?

                                                                                • rubatuga 4 years ago
                                                                                  Lol, they state that their model is a 1000 times more powerful than BERT? Under what metric?
                                                                                  • YetAnotherNick 4 years ago
                                                                                    According to my understanding they are referring to parameter count. If we go by that logic, BERT has 340M parameters. GPT3 has 175B. So this will have 340B parameters?
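
                                                                                    A quick back-of-the-envelope check of that arithmetic, assuming "1000 times more powerful" really does mean 1000 times BERT-large's parameter count (the blog post doesn't say which metric it uses):

```python
# Parameter counts as cited in the comment above.
BERT_LARGE_PARAMS = 340_000_000      # 340M
GPT3_PARAMS = 175_000_000_000        # 175B
SCALE = 1000                         # the "1000 times" claim, read as parameter count (assumption)

implied_params = BERT_LARGE_PARAMS * SCALE
ratio_to_gpt3 = implied_params / GPT3_PARAMS

print(f"implied size: {implied_params / 1e9:.0f}B parameters")
print(f"relative to GPT-3: {ratio_to_gpt3:.2f}x")
```

                                                                                    Under that reading the model would be a bit under twice the size of GPT-3.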
                                                                                    • aledalgrande 4 years ago
                                                                                      That's what I was wondering! Such gibberish
                                                                                      • contravariant 4 years ago
                                                                                        Well, so far they're mostly talking about what it would be able to do, so it's probably more wishful thinking than any exact metric.
                                                                                        • 4 years ago
                                                                                        • disabled 4 years ago
                                                                                          > trained on 75+ languages, can transfer knowledge between languages

                                                                                          There is zero possibility that Google accomplished proper "language transfer" with the vast majority of Silicon Valley programmers being native English speakers.

                                                                                          In some languages, if you accidentally use a wrong single syllable in any sentence, you can end up saying something extremely embarrassing--and entirely different. This is the case with many Slavic languages.

                                                                                          This is a memorable "classic" [1]:

                                                                                          > "Tony Henry belted out a version of the Croat[ian] [national] anthem before the 80,000 crowd, but made a blunder at the end. He should have sung 'Mila kuda si planina' (which roughly means 'You know my dear how we love your mountains'). But he instead sang 'Mila kura si planina' which can be interpreted as 'My dear, my penis is a mountain'."

                                                                                          Many languages are much more grammatically complex than English, and also carry an unbelievable amount of implicit contextual information in their grammatical morphology. Slavic languages, for example, tend to be this way. The Slavic language that I speak, Croatian, tends to be very clean, direct, and concise, while being extremely complicated grammatically. Also, we have a lot of different words for the same thing in Croatian, which, in combination with the complicated grammar, makes it a very expressive language. English, however, can be more expressive in the sense that it allows for more figurative language, such as idioms.

                                                                                          [1] BBC: Anthem gaffe 'lifted Croatia': http://news.bbc.co.uk/sport2/hi/football/7109058.stm

                                                                                          • gok 4 years ago
                                                                                            Modern NLP architectures do not explicitly model language structure. Even in English, the model isn't directly told anything about how words work. So the native language of the human authors of the model is (in principle) irrelevant to how effective the system is.
                                                                                            • TulliusCicero 4 years ago
                                                                                              > There is zero possibility that Google accomplished proper "language transfer" with the vast majority of Silicon Valley programmers being native English speakers.

                                                                                              This speaks to ignorance of who Google employs. A ton of the engineers there are immigrants. When I was on Google Photos in MTV, I'd estimate it was about evenly split between native, English-first speakers and people who were either non-native English speakers or grew up with two languages simultaneously (children of first gen immigrants in the US).

                                                                                              Silicon Valley has a huge amount of cultural and ethnic diversity, so I don't know why you would make this mistake.

                                                                                              • yoz-y 4 years ago
                                                                                                > There is zero possibility that Google accomplished proper "language transfer" with the vast majority of Silicon Valley programmers being native English speakers.

                                                                                                I don't know the people who worked on this project, but you do realise that Google employs swaths of programmers who are not native English speakers?

                                                                                              • osipov 4 years ago
                                                                                                There is nothing here but a promise. Back in the day we called this "vaporware".
                                                                                                • gerash 4 years ago
                                                                                                  I don't think it's vaporware, but the blog post with all these big claims like "1000 times more powerful than BERT" (based on our arbitrary cherry-picked metric) makes one cringe.

                                                                                                  Here's my guess: some team under web search trained a large Transformer-based model, with some adjustments here and there, but now on a massive dataset of crawled web pages using tons of TPUs. It made an incremental improvement to the search quality metrics and was shipped to production.

                                                                                                  • Lyapunov_Lover 4 years ago
                                                                                                    We sort of already know that these models scale in such a way that a model with 1000 times the parameters is, indeed, 1000 times more powerful. We haven't found a ceiling effect yet, so the onus is on the skeptics. These things scale.
                                                                                                  • xapata 4 years ago
                                                                                                    It's Schrödinger's vaporware. We'll find out some years from now. In Perl 6's case, what, 12 years after the announcement?
                                                                                                    • azinman2 4 years ago
                                                                                                      Except this is Google not some startup.
                                                                                                      • detaro 4 years ago
                                                                                                        Vaporware also happens with established companies.
                                                                                                      • bpodgursky 4 years ago
                                                                                                        After seeing Alpha* solve Go, Chess, and protein folding in the past ~3 years, I think it would be pretty silly for your prior to be discounting any Google AI project as vaporware.

                                                                                                        Their models accomplish ridiculously powerful things. Tbh I think it's far _more_ likely the answer is "this is crazy powerful, but the engineers didn't feel like writing a blog post about it, and the marketing team hasn't figured out how to monetize it yet".

                                                                                                        • MathYouF 4 years ago
                                                                                                          If there's anything SoTA AI researchers love and have experience doing it's writing blog posts and papers explaining how.

                                                                                                          The lack of details makes me think they're either hiding a new technique they'd rather keep secret because it provides a competitive advantage, or that it's really only a marginal improvement over existing NLP models (or an ensemble of them with nearly no improvement on any given metric) and the 1000x improvement is on a metric that no actual ML scientist would respect.

                                                                                                          I don't have the slightest bit of information about Google's AI team to know if those are the only two options and if so which is more likely.

                                                                                                          • visarga 4 years ago
                                                                                                            I think showing the model would immediately trigger the critics to nitpick it like the famous "He is a doctor. She is a nurse." case, so they just don't show it until they figure out a way to avoid that. Moreover, language models are easy to trick into politically incorrect conversations and porn. AI Dungeon's GPT-3 was writing lots of porn, for example.
                                                                                                      • cromwellian 4 years ago
                                                                                                        In most sci-fi, you ask the ship computer a question and it can answer using the sum total of all human information.

                                                                                                        But judging by the comments here, when Captain Picard asks the ship how long to Starbase 17 at Warp 9, rather than answer, you want it to tell the Captain to visit WarpTravelCalculator.com.

                                                                                                        If you publish information in this world, there’s nothing preventing people from learning it and rewriting it in a new way. Humans do it all the time and they don’t pay the people they learned it from a portion of proceeds.

                                                                                                        Future AI will do this too. I want machine learning to read every book and paper ever written and be able to answer queries and summarize things for me.

                                                                                                        We may need to find a better model for encouraging content contribution to society besides copyright and demanding royalties on every use.

                                                                                                        • mfer 4 years ago
                                                                                                          The analogy here doesn't work well for a few reasons....

                                                                                                          1. It mixes mapping math calculations with published information like texts.

                                                                                                          2. The AI in star trek worked to serve the end user, in this case Picard. In our world the AI systems are designed to serve the software's owner such as Google. It's not trying to give you the best answer. Instead it's trying to provide you responses that make Google the most money or get them into positions of power and influence the leaders want.

                                                                                                          3. Star Trek takes place in a world where the Federation doesn't use money and everyone is motivated to put in a hard day's work. On most planets they don't have poor people. This does not fit the societal and cultural dynamic we have now.

                                                                                                          > We may need to find a better model for encouraging content contribution to society besides copyright and demanding royalties on every use.

                                                                                                          Right now we have a problem where people are trying to step on content creators. I was reading about an example where singers were trying to get added to songs as writers, when they didn't write the songs, so they could get more of the writers' royalty from sales. We live in a world where some will beg, borrow, steal, plagiarize, and generally try to hurt others to get a leg up, including many at big businesses who would leverage AI for that.

                                                                                                          We may hope for the best but we should plan for the worst.

                                                                                                          • selfhoster11 4 years ago
                                                                                                            Most of those starship computers are autonomous. In the current "AI" model, they would be reduced to a mere glorified Amazon Echo speaker. I think that's an important distinction to have.
                                                                                                            • zepto 4 years ago
                                                                                                              Very much this. People yearn for a world of a giant number of websites and software packages like in the old days, but the reality is that a humane computer may not need a lot of different interfaces.
                                                                                                            • anigbrowl 4 years ago
                                                                                                              > When I tell people I work on Google Search, I’m sometimes asked, "Is there any work left to be done?" The short answer is an emphatic “Yes!” There are countless challenges we're trying to solve so Google Search works better for you.

                                                                                                              Sorry to be off-topic but it's hard to get excited about blue sky ventures when the search UI offers no capability for simple things like delivering search results in date order. You can filter results by date, but not sort them.

                                                                                                              • bigyikes 4 years ago
                                                                                                                I would bet that sorting isn’t so simple, at least if you want it to be any good. If you did a naive chronological sort, I imagine you would end up with a whole lot of irrelevant results at the top. There is just too much stuff out there.

                                                                                                                To be useful, your “sort” would really just need to be another parameter to the existing relevancy model. And if you did that, then people would probably complain that “it’s not a real sort” and we’re back to square one.

                                                                                                                Edit: You know what, this probably is simple for Google, because they’re freakin Google. To your point, I guess they probably don’t do this because money.

                                                                                                                • zepto 4 years ago
                                                                                                                  > I guess they probably don’t do this because money.

                                                                                                                  Exactly this. There are many controls they could have given us to trivially improve search for end users without needing this AI, but they would have made search less good for their customers, the advertisers.

                                                                                                                  • moultano 4 years ago
                                                                                                                    It's really hard to design something like that that anyone would get value out of using because matching the query isn't a binary notion.
                                                                                                                  • benhurmarcel 4 years ago
                                                                                                                    Another simple thing is that there's no way to not get localized results.

                                                                                                                    I'm currently in Spain. I'm not Spanish. If I want results that don't have to do with that country, and aren't in Spanish, I need to use Duckduckgo. Google is unable to not give localized results.

                                                                                                                    • selfhoster11 4 years ago
                                                                                                                      google.com/ncr or use a VPN
                                                                                                                      • benhurmarcel 4 years ago
                                                                                                                        google.com/ncr doesn't work anymore. Both the interface and results are localized in Spanish when I try it.
                                                                                                                    • abeppu 4 years ago
                                                                                                                      Should you always be able to sort results by date? If I search for "California", does it really ever make sense to date-sort all the pages that match?
                                                                                                                      • taeric 4 years ago
                                                                                                                        It's a good way to find things you had seen in the past when you don't recall the exact date.
                                                                                                                        • shadowgovt 4 years ago
                                                                                                                          The range filter exists for that.

                                                                                                                          Sorting by date wouldn't work for that. You'll have a ridiculous number of pages of oldest search results (or newest, depending on sort order).

                                                                                                                        • taneq 4 years ago
                                                                                                                          Sure it does. I might want things written about California in the past week, or in October last year, or in 2005.
                                                                                                                      • pradn 4 years ago
                                                                                                                        The way search works:

                                                                                                                        1) terms are split into tokens

                                                                                                                        2) tokens are looked up to find documents

                                                                                                                        3) documents are ranked by scoring functions

                                                                                                                        I suspect sorting by chronological order might require too much document metadata to be retrieved at step 2. (A lot of filtering occurs between steps 2 and 3.)
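
                                                                                                                        The three steps above can be sketched as a toy inverted-index search. This is only an illustration under simplifying assumptions, not how Google's pipeline actually works; the function names, the sample documents, and the overlap-count score are all made up for the example:

```python
from collections import defaultdict


def tokenize(text):
    """Step 1: split text into lowercase tokens (naive whitespace split)."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index mapping each token to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(query, docs, index):
    tokens = tokenize(query)

    # Step 2: look up every query token to collect candidate documents.
    candidates = set()
    for token in tokens:
        candidates |= index.get(token, set())

    # Step 3: rank candidates with a toy score, the number of query tokens matched.
    def score(doc_id):
        doc_tokens = set(tokenize(docs[doc_id]))
        return sum(1 for token in tokens if token in doc_tokens)

    # Highest score first; ties broken by document id for a stable order.
    return sorted(candidates, key=lambda doc_id: (-score(doc_id), doc_id))


docs = {
    1: "california wildfire news",
    2: "california hiking trails",
    3: "tokyo ramen guide",
}
index = build_index(docs)
results = search("california hiking", docs, index)
```

                                                                                                                        In this toy version, a date sort would need a timestamp for every candidate coming out of step 2 before any ranking happens, which is the kind of metadata cost suspected above.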

                                                                                                                        • johncena33 4 years ago
                                                                                                                          A simple explanation is that, product-wise, this feature request doesn't make sense. The number of users who would use said feature is not worth the amount of effort needed to implement, maintain, and operationalize it in a product with literally a billion users.
                                                                                                                        • Nition 4 years ago
                                                                                                                          I have the same feeling every time Amazon announces some amazing thing and their Top Rated sort still isn't weighted by review count.
                                                                                                                        • cblconfederate 4 years ago
                                                                                                                          I really hope Google gets some competition in their NN endeavors because they are creating an economy that sucks in free information and eventually spews out buying recommendations. In the past they would compensate websites for providing the precious raw material for their results with advertising. With DL models websites don't need to get anything back. This will lead to stale information or pretty much end the web
                                                                                                                          • azinman2 4 years ago
                                                                                                                            You’re being downvoted but it’s actually an interesting issue. Many companies (Yelp) have already suffered from quick results… at a certain point Google will have a hive mind but little reason to have you go any further. This is good as a user (hypothetically), but does not contribute back at all to the producers of such information who may have additional value to unlock.

                                                                                                                            Meanwhile the Reddits and whatnot can’t afford to not have Google index them, so this is just the price of admission. I wonder if they need an extension to do-not-crawl that lets you specify how the data could be used?

                                                                                                                            • shadowgovt 4 years ago
                                                                                                                              Are there other reasons than financial compensation that someone would put facts on a web page?
                                                                                                                              • erikerikson 4 years ago
                                                                                                                                They believe it increases the probability of a world outcome the publisher prefers (e.g. activism, advancement of humanity, ...).
                                                                                                                                • davedx 4 years ago
                                                                                                                                  Do you think Wikipedia is driven by financial compensation?
                                                                                                                              • ping_pong 4 years ago
                                                                                                                                Wasn't Google supposed to have some sort of AI that could make phone calls for you? It looked amazing when they demo'ed it but I haven't heard diddly squat since then. Did they cancel that project?
                                                                                                                                • refulgentis 4 years ago
                                                                                                                                  It works just fine and has been active for a year or so now, except in one state (Indiana?)
                                                                                                                                  • datguacdoh 4 years ago
                                                                                                                                    might be region specific, but I can use it from my Google home devices. less useful when pandemic hit.
                                                                                                                                    • johnghanks 4 years ago
                                                                                                                                      It's in use on Pixel phones -- you can use it to screen suspected spam callers. I think the more advanced version, the one that called restaurants on your behalf, was canned or sent back to the drawing board because too many people were automatically hanging up on anything resembling a robot.
                                                                                                                                    • roca 4 years ago
                                                                                                                                      Their hiking question is an odd example. Technology like this is probably perfectly fine for asking questions with low downside for wrong answers. But if someone asks "I've hiked Mt Pirongia and now I want to hike Mt Taranaki; how do I need to prepare differently?" and Google erroneously answers "nothing", that could get someone killed.
                                                                                                                                      • xapata 4 years ago
                                                                                                                                        Are you suggesting that's a reason to not do this research?
                                                                                                                                        • roca 4 years ago
                                                                                                                                          Not at all. I'm suggesting that when writing up a PR blog post, choose examples where applying your technology is a sensible and safe thing to do.
                                                                                                                                          • xapata 4 years ago
                                                                                                                                            That makes sense. What would have been a better example?
                                                                                                                                      • ljm 4 years ago
                                                                                                                                        An AI named after the British diminutive for 'mother' is surely a wise choice. I would not trust this AI unless it kissed my forehead and tucked me into bed.
                                                                                                                                        • drdeca 4 years ago
                                                                                                                                          I'm reminded of the parody search engine/character named "MOM" depicted in the tower-building game "World of Goo". She promises to make lots of cookies and offers to send emails with many promotional offers.
                                                                                                                                          • dbuder 4 years ago
                                                                                                                                            You will do as your MUM says. Mum knows best. You will eat the bugs and you will like it.
                                                                                                                                            • ColinHayhurst 4 years ago
                                                                                                                                              • moritonal 4 years ago
                                                                                                                                                “When in trouble come to Mum, Mum will do your little sum”

                                                                                                                                                Don't know if it's related, but the above is Arup’s speech for the computer he christened Mumbo-Jumbo.

                                                                                                                                                • mark_l_watson 4 years ago
                                                                                                                                                  My first thought was comparing to “Mother” in the book/movie Alien.
                                                                                                                                                  • cblconfederate 4 years ago
                                                                                                                                                    it's just temporary until they perfect DADDY
                                                                                                                                                    • bobthechef 4 years ago
                                                                                                                                                      In this day and age? Not likely. Daddy's turned into a eunuch. Mum's in charge now. There, there... Come to Big Mum.
                                                                                                                                                  • sjg007 4 years ago
                                                                                                                                                    A lot of knowledge on the internet is just wrong. Also a lot of scientific progress is driven by folks persisting against the current dogma. So that seems like a big problem. I imagine this is true for almost any subject where there is tribal domain expertise.
                                                                                                                                                    • atemerev 4 years ago
                                                                                                                                                      ...and still, Google Suggestions cannot understand that in Switzerland, part of the population does not speak German (e.g. here in Geneva; we are a trilingual country), and it only shows me search completions in German (from the browser search bar). And there is no way to change the language there. I would prefer English.
                                                                                                                                                      • aembleton 4 years ago
                                                                                                                                                        what is your `Accept-Language` header set to?
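For context on the question above: a browser sends `Accept-Language` as a comma-separated list of language tags, each with an optional `q` weight (default 1.0). Below is a minimal sketch of how a server might rank those preferences. The function name `parse_accept_language` and the example header value are hypothetical, and nothing in this thread establishes whether Google's suggestion service actually honors the header.

```python
def parse_accept_language(header):
    """Return (language_tag, weight) pairs, highest weight first.

    Simplified illustration: handles only the optional "q=" parameter
    and treats a malformed weight as 0.0.
    """
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        params = params.strip()
        weight = 1.0  # a tag without an explicit q value defaults to 1.0
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0
        prefs.append((tag.strip(), weight))
    # sorted() is stable, so tags with equal weight keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


# Hypothetical header for a French-speaking user in Switzerland who
# prefers English over German as a fallback.
example = "fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7"
ranked = parse_accept_language(example)
```

If German completions still appear with a header like the example, the service is presumably deciding on something else, such as location or account settings.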
                                                                                                                                                    • benjaminjosephw 4 years ago
                                                                                                                                                      This isn't "better search" it's entrenched market domination from the only player with enough smarts, data and (crucially) users to make this work.

                                                                                                                                                      While Google is building a bigger and "better" Behemoth we should ask if this kind of innovation is really doing anything at all to make the world a better place in a meaningful way. Better monetization of search seems like a way to make the world worse in my opinion.

                                                                                                                                                      • fassssst 4 years ago
                                                                                                                                                        I love how the example is a problem only a rich techie would have.
                                                                                                                                                        • phpsuks 4 years ago
                                                                                                                                                          The examples are created by non-tech people
                                                                                                                                                        • d--b 4 years ago
                                                                                                                                                          There is no doubt that given the current state of AI, these requests would produce bullshit answers. AI is just not capable of constructing the proper conceptual models for now. But it sure can give you some answers.

                                                                                                                                                          It's sad to see that they'll be spending so much time, effort and money on this...

                                                                                                                                                          • raybb 4 years ago
                                                                                                                                                            Edit:

                                                                                                                                                            "Google MUM MultiTask Unified Model Introduction" https://youtu.be/s7t4lLgINyo

                                                                                                                                                            I originally posted the LaMDA video: https://youtu.be/aUSSfo5nCdM

                                                                                                                                                            • aledalgrande 4 years ago
                                                                                                                                                              Even in the video he is just citing the same content of the article.
                                                                                                                                                              • tachyonbeam 4 years ago
                                                                                                                                                              This video is so Silicon Valley, it's amazing. They've obviously spent a lot of money producing it, but it's all vague claims; there isn't even a compelling demo. I'm guessing they're aiming for an audience of mainstream journalists, but they're not actually launching a new product per se. What gives? Why are they trying to hype something that's not ready, isn't going to be released as a product, and that they're not willing to properly showcase or even explain at any level of detail?
                                                                                                                                                              • aphextron 4 years ago
                                                                                                                                                                >Take this scenario: You’ve hiked Mt. Adams. Now you want to hike Mt. Fuji next fall, and you want to know what to do differently to prepare.

                                                                                                                                                                Ah yes, that totally common scenario which I'm faced with all the time.

                                                                                                                                                                I love this. It perfectly illustrates the peril we are in with the current state of AI research. That the author would choose this as a problem to solve shows exactly the socioeconomic class they come from, and how that influences the way they solve problems. It may seem like a trivial and meaningless example, but these subtle biases will creep their way into these systems and be amplified. And you can bet that this kind of work is the foundation for what will become the technology that eventually governs every facet of our lives once AGI is a thing.

                                                                                                                                                                I, for one, am terrified of the implications that a bougie tech bro AI overlord entails.

                                                                                                                                                                • ergot_vacation 4 years ago
                                                                                                                                                                  No need to resort to goofy phrases like "bougie tech bro." They're just out-of-touch rich people. Same as it ever was.

                                                                                                                                                                  If that's your concern though, the good news is that in its purest form, machine learning tends to bend AWAY from this. You need large data sets to get good results, which means these projects tend to sample huge chunks of the general Internet, not just the isolated bubbles of SV types. Of course this still has limits, any data set has limits. You can only scrape data from the net if someone has posted that data in the first place, for example.

                                                                                                                                                                  But in their initial form, a lot of these models are pretty diverse. That's why AI Dungeon had all kinds of "objectionable" content that kept getting the always-offended on their case: GPT-3 is just built off the general Internet, including a lot of weird, fucked up shit. The real problem is that inevitably someone complains, and they start hacking away at the ideal model to try to make it squeaky clean and ruin it in the process.

                                                                                                                                                                  If you want to keep the tech from being perverted by "bougie tech bros," focus on the censorship. The models often start off pretty good.

                                                                                                                                                                  • nexuist 4 years ago
                                                                                                                                                                    I am surprised with how many people in this thread are equating mountain climbing with techbro culture. Really? How are those related? The fact that some techbros climb some mountains for fun?

                                                                                                                                                                    How about the millions of people in rural counties and developing countries without access to vehicles who rely on walking across difficult terrain to make deliveries / get to work / get to school / visit family? Are they also techbros? My grandfather was an electrician in Albania and he would regularly walk dozens of miles on foot including through mountain ranges in order to get between jobs. Granted, this was dozens of years ago, but there's no reason to believe there isn't someone doing the same thing today.

                                                                                                                                                                    If anything your own upper middle class bias is showing here, because you assume that everyone who navigates terrain is doing so for fun and not because they don't have other options.

                                                                                                                                                                    • babesh 4 years ago
                                                                                                                                                                      Generally, it is the upper middle class that travel to different countries to go hiking. The lower class aren’t traveling to Japan to hike Mt Fuji. Also, hiking Mt. Fuji requires some care.

                                                                                                                                                                      https://www.thesun.co.uk/news/10248155/climber-livestreams-d...

                                                                                                                                                                      Indoor climbing is definitely a SF techie thing. Tons of tech people climbed at Mission Cliffs.

                                                                                                                                                                      • bobthechef 4 years ago
                                                                                                                                                                        It's a behavioral shibboleth. They don't hike for pleasure or genuine reasons. They hike so that they can post a picture on Instagram. It's just a thing you do as part of the bland, petty, superficial, materialistic upper middle class bubble.
                                                                                                                                                                      • selfhoster11 4 years ago
                                                                                                                                                                        Researching, travelling to and from the mountain, buying and maintaining equipment, and getting training for mountain climbing all cost time and money. Techbros have at least the latter in great abundance.
                                                                                                                                                                      • des1nderlase 4 years ago
                                                                                                                                                                      We are all, in general, smart. It's the data points from our surroundings that differentiate our opinions.
                                                                                                                                                                    • 1vuio0pswjnm7 4 years ago
                                                                                                                                                                      Some Googler or Google fan replied to me yesterday with, "Sheesh. Why the FUD."

                                                                                                                                                                      Ask MUM.

                                                                                                                                                                      • lifeisstillgood 4 years ago
                                                                                                                                                                        A bit off topic but I am wondering if there are open knowledge graphs in public?

                                                                                                                                                                        Ignoring AI etc, my kids play a couple of games where there is clearly some backend that "knows" Taylor Swift is a Singer, is Female, and has acted in this movie X

                                                                                                                                                                      You can go a long way in a Turing test with that, and I was wondering if folks knew where those graphs are built?
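The kind of backend described above is commonly modeled as subject-predicate-object triples. Here is a minimal sketch of that idea. The `TripleStore` class, the predicate names, and the placeholder title "Movie X" are illustrative assumptions, not the data model of any particular game or public graph.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory store of (subject, predicate, object) facts."""

    def __init__(self):
        # subject -> set of (predicate, object) pairs
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Everything the subject is linked to through the predicate."""
        facts = self._by_subject.get(subject, ())
        return sorted(o for p, o in facts if p == predicate)

    def subjects(self, predicate, obj):
        """Every subject that has the given (predicate, object) fact."""
        return sorted(
            s for s, facts in self._by_subject.items() if (predicate, obj) in facts
        )


graph = TripleStore()
graph.add("Taylor Swift", "occupation", "Singer")
graph.add("Taylor Swift", "gender", "Female")
graph.add("Taylor Swift", "acted_in", "Movie X")  # placeholder title

occupations = graph.objects("Taylor Swift", "occupation")
cast_of_x = graph.subjects("acted_in", "Movie X")
```

A guessing game can then answer "is this person a singer?" with a single lookup, which is why a fairly small graph can feel surprisingly knowledgeable.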

                                                                                                                                                                      • sjg007 4 years ago
                                                                                                                                                                        Makes sense. I want insights and context. If Google can do that synthesis that’s great. I do wonder about the training data and data quality though. When I do these targeted searches you have to filter the spam... books are somewhat better but nothing beats talking to someone who lives it or did it.
                                                                                                                                                                        • robkop 4 years ago
                                                                                                                                                                          I can't see any link to an actual paper, anyone know if they released one for this?
                                                                                                                                                                        • Lyapunov_Lover 4 years ago
                                                                                                                                                                          I see a lot of people here expressing doubts and confusion. I want to try to clear up some of that.

                                                                                                                                                                          The key notion here is scale relativity. This is the reason why transformer models have been so, well, transformative. Bigger models are better than smaller models in a proportional manner. That is, they display scale relativity. Where is the limit? Where does this break down? We don't know. We haven't found the ceiling yet.

                                                                                                                                                                          Another important notion is multimodality. When you can cross-reference your text-based knowledge of an apple with your image-based knowledge of an apple, you can use this information as leverage. Archimedes said, "Give me a place to stand on, and I'll move the Earth." It might seem ridiculous to say that the same is true when it comes to information, but it is. Informational leverage is powerful. Multimodality allows you to make very accurate predictions. The McGurk effect is a nice demonstration of how we do the exact same thing. We rely on visual information from a speaker's lips to predict what they're going to say. In other words: we make use of multimodal leverage.

                                                                                                                                                                          The twin notions of scale relativity and multimodality explain what makes MUM possible. As some of you have pointed out, there's another aspect that we can't ignore: utility. Google will be using MUM to make money. Which means that they'll have to train MUM to make you spend it. But if you're uncomfortable with this idea, you are uncomfortable with capitalism in general. Which is fair, but I think it's important to keep it in mind.

                                                                                                                                                                          As I'm sure they've already considered at Google, MUM can be used to revolutionize education. Imagine people all over the world having access to an expert instructor who can answer all of your questions. You might think this sounds like a dream, but we're a mere stone toss away from achieving it. That's the true power of scale relativity + multimodality: we can now make advanced systems that can communicate with us.

                                                                                                                                                                          I appreciate the skeptics and naysayers here: you keep the rest of us sane. For that, I thank you. At the same time, I want you to open your eyes to the possibility that something very important and transformative is happening right now. You don't have to go full Kurzweil, but I think you would benefit from reflecting on the opportunities this new technology might offer.

                                                                                                                                                                          • Ajedi32 4 years ago
                                                                                                                                                                            Yeah, I'm a little surprised at all the negativity here considering the game-changing potential of this sort of research. The HN crowd has always been a pretty cynical bunch, but come on! A single model that can extract information from images, text, and webpages across multiple languages and generate answers in response to natural language questions written by a user? This feels like straight-up wizardry!
                                                                                                                                                                          • endisneigh 4 years ago
                                                                                                                                                                            I find it difficult that Google wants search to be easier for the end user - for example I believe a very long time ago you could setup sites to exclude from all of your searches - I don’t think this is possible any longer.
                                                                                                                                                                              • tinyhouse 4 years ago
                                                                                                                                                                                As usual, a lot of AI hype from Google.
                                                                                                                                                                                • 42droids 4 years ago
                                                                                                                                                                                  "Since MUM can surface insights based on its deep knowledge of the world" Which just means taken from the millions of websites written by humans and used without permission or any payment.
                                                                                                                                                                                  • alcover 4 years ago

                                                                                                                                                                                      "Is there any work left to be done?"
                                                                                                                                                                                    
                                                                                                                                                                                    The short answer is an emphatic “Yes! Dismantling your monster of a corporation!”
                                                                                                                                                                                      • sboomer 4 years ago
                                                                                                                                                                                        Any millennial who is using search for some time would easily know where to find what he needs. This sounds like Google is trying hard to drive more money out of its search business.
                                                                                                                                                                                        • aaron695 4 years ago
                                                                                                                                                                                          > "Is there any work left to be done?"

                                                                                                                                                                                          Google could search captions on all the Youtube (etc) videos. Not sure why this doesn't happen. Along with a few other big resources not indexed.

                                                                                                                                                                                          I think the big thing with the article (taken as a workable technology) is that it's not search; it's taking other people's information and transforming it into a Google resource.

                                                                                                                                                                                          Which does add to humanity's knowledge, but it's owned by Google, and Google profits from it.

                                                                                                                                                                                          • SiempreViernes 4 years ago
                                                                                                                                                                                            When the text starts with "Is there any work left to be done?" The short answer is an emphatic “Yes!” I was sort of hoping they would announce that pinterest will now be banned from all non-image search results...

                                                                                                                                                                                            Instead it's an announcement that Google has made a new, even bigger, pile of linear algebra that can sort of answer questions and won't end up like Watson.

                                                                                                                                                                                            I like that they put in a deadpan bit about how they are very ethical when they make and then exploit their huge collections of data found by their spiders. There sure hasn't been any AI controversy at google this quarter, no sir-e!

                                                                                                                                                                                            • gerdesj 4 years ago
                                                                                                                                                                                              "When I tell people I work on Google Search, I’m sometimes asked, "Is there any work left to be done?" The short answer is an emphatic “Yes!”

                                                                                                                                                                                              Hands up everyone who is 100% satisfied with Search ... ... OK no one.

                                                                                                                                                                                              So now we have an unsolved problem left behind in favour of ... chat about mountains ...

                                                                                                                                                                                              "MUM has the potential to transform how Google helps you with complex tasks. Like BERT, MUM is built on a Transformer architecture, but it’s 1,000 times more powerful. MUM not only understands language, but also generates it."

                                                                                                                                                                                              Piss off and while you are at it, get BERT to explain my response to MUM or vice versa.

                                                                                                                                                                                              If MUM can decipher my immediately prior sentence given this input then I might start to get interested.

                                                                                                                                                                                              • rabbits77 4 years ago
                                                                                                                                                                                                There is nothing in that press release that could not have been done in the 1980s with Prolog.

                                                                                                                                                                                                Yeah, it’d have been more code but you would not have needed to destroy a forest to train the thing.

                                                                                                                                                                                                This is the NLP trade off of the 21st century. The code is easier to write but the model is completely opaque, and you need to really burn a lot of electricity to make it work.

                                                                                                                                                                                                • xkapastel 4 years ago
                                                                                                                                                                                                  This is totally false, I dare you to write anything close to e.g. BERT with Prolog.
                                                                                                                                                                                                  • kajecounterhack 4 years ago
                                                                                                                                                                                                    > This is the NLP trade off of the 21st century. The code is easier to write but the model is completely opaque, and you need to really burn a lot of electricity to make it work.

                                                                                                                                                                                                    This is basically a meme now. We actually have a pretty good understanding of how the models work. In fact that understanding is how you can do things like build chatbots that don't spew hate.

                                                                                                                                                                                                    Also, the electrical cost of training large language models is indeed high (e.g., GPT-3 has 175B params and is estimated at 190,000 kWh to train on GPUs). But the folks who pay the cost (basically OpenAI, Google, MSFT, Facebook, Amazon) are incentivized to make that go down (TPUs are way more efficient than GPUs), and they are incentivized to train infrequently because it costs $$$.

                                                                                                                                                                                                    FWIW Google's datacenters are also technically carbon neutral. I know that's not great because carbon credits don't have the impact that folks think they have, but there is definitely a difference in ecological impact between datacenter electricity and other kinds of energy usage (e.g., cars all burning fossil fuels).

                                                                                                                                                                                                    Okay also let's compare to bitcoin, which is the real ecological disaster if we want to talk about inefficient software: ~387,096,774 kWh PER DAY. _and_ incentivizing things like cheap coal, and miners are definitely not using their crypto wealth to purchase carbon offset credits :(
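
To put the two figures quoted above side by side, here is a rough back-of-the-envelope sketch. It only uses the 190,000 kWh and ~387,096,774 kWh/day estimates from this comment; both are estimates rather than measurements, and the function names are made up for illustration.

```python
# Estimates quoted in the comment above (not measured values).
GPT3_TRAINING_KWH = 190_000          # one GPT-3 training run, estimated
BITCOIN_KWH_PER_DAY = 387_096_774    # Bitcoin network, estimated, per day

SECONDS_PER_DAY = 24 * 60 * 60


def training_runs_per_bitcoin_day(training_kwh: float, bitcoin_kwh_per_day: float) -> float:
    """How many training runs fit into one day of Bitcoin's energy use."""
    return bitcoin_kwh_per_day / training_kwh


def bitcoin_seconds_per_training_run(training_kwh: float, bitcoin_kwh_per_day: float) -> float:
    """How many seconds of Bitcoin's daily consumption equal one training run."""
    return training_kwh / bitcoin_kwh_per_day * SECONDS_PER_DAY


def annual_twh(kwh_per_day: float, days: int = 365) -> float:
    """Convert a daily kWh figure to TWh per year (1 TWh = 1e9 kWh)."""
    return kwh_per_day * days / 1e9


if __name__ == "__main__":
    runs = training_runs_per_bitcoin_day(GPT3_TRAINING_KWH, BITCOIN_KWH_PER_DAY)
    secs = bitcoin_seconds_per_training_run(GPT3_TRAINING_KWH, BITCOIN_KWH_PER_DAY)
    print(f"Training runs per Bitcoin day: {runs:.0f}")
    print(f"Bitcoin seconds per training run: {secs:.1f}")
    print(f"Bitcoin per year: {annual_twh(BITCOIN_KWH_PER_DAY):.1f} TWh")
```

Taking those estimates at face value, one day of Bitcoin mining is on the order of two thousand GPT-3 training runs.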

                                                                                                                                                                                                    • smokefoot 4 years ago
                                                                                                                                                                                                      I mean, yes. But it is a funny example to choose to illustrate the power of an NN approach. They're talking about mountains--an entity that has very concrete and definable attributes (e.g., height). And the rest of the examples are similarly dealing with semi-structured data that could theoretically be represented in RDF or something like that.
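
A minimal sketch of what "represented in RDF or something like that" could look like, using plain subject/predicate/object tuples rather than a real RDF library. The predicate names, the `lookup` and `height_difference_m` helpers, and the second mountain are assumptions made up for illustration; the data values are illustrative, not an authoritative dataset.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object


# Illustrative facts only; predicate names are hypothetical.
FACTS = [
    Triple("Mt. Fuji", "height_m", 3776),
    Triple("Mt. Fuji", "country", "Japan"),
    Triple("Mt. Fuji", "rainy_season", "fall"),
    Triple("Mt. Adams", "height_m", 3743),
    Triple("Mt. Adams", "country", "United States"),
]


def lookup(subject: str, predicate: str) -> Optional[object]:
    """Return the first object matching (subject, predicate), or None."""
    for fact in FACTS:
        if fact.subject == subject and fact.predicate == predicate:
            return fact.obj
    return None


def height_difference_m(a: str, b: str) -> Optional[int]:
    """Height of a minus height of b in metres, or None if either is unknown."""
    ha, hb = lookup(a, "height_m"), lookup(b, "height_m")
    if ha is None or hb is None:
        return None
    return ha - hb


if __name__ == "__main__":
    print(lookup("Mt. Fuji", "rainy_season"))
    print(height_difference_m("Mt. Fuji", "Mt. Adams"))
```

Structured lookups like this cover the "how tall / when is it rainy" part of the query; they say nothing about interpreting a free-form question in the first place, which is the part the NN is being pitched for.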

                                                                                                                                                                                                      There's been a bit of discussion on HN lately about the effectiveness of sophisticated models vs. just good metadata.

                                                                                                                                                                                                      • h0l0cube 4 years ago
                                                                                                                                                                                                        The better wager is: I dare you to write and train up a sophisticated real-time neural network model that can interpret human language and provide reliably useful contextual search results with the compute power and memory constraints of the 80s.
                                                                                                                                                                                                        • ulber 4 years ago
                                                                                                                                                                                                          Why would anyone take that wager? I see no reason to believe that's possible with either Prolog or NNs when you're restricted to 80s hardware.
                                                                                                                                                                                                        • rabbits77 4 years ago
                                                                                                                                                                                                          You know nothing of Prolog, obviously.

                                                                                                                                                                                                          But whatever, I get it. Make the GPU go brrrr.

                                                                                                                                                                                                          Oh, and most of the NLU of IBM Watson is Prolog.