Brave Leo now uses Mixtral 8x7B as default
244 points by b_mc2 1 year ago | 178 comments
- rhdunn 1 year agoIf you want to run Mixtral 8x7B locally you can use llama.cpp (including with any of the supporting libraries/interfaces such as text-generation-webui) with https://huggingface.co/TheBloke/Nous-Hermes-2-Mixtral-8x7B-S....
The smallest quantized version (2bit) needs 20GB of RAM (which can be offloaded onto the VRAM of a decent 4090 GPU). The 4bit quantized versions are the largest models that can just about fit onto a 32GB system (29GB-31GB). The 6bit (41GB) and 8bit (52GB) models need a 64GB system. You would need multiple GPUs with shared memory if you wanted to offload the higher precision models to VRAM.
I've experimented with the 7B and 13B models, but haven't experimented with these models yet, nor other larger models.
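If it helps, here is a minimal sketch of what this looks like through the llama-cpp-python bindings, with part of the model offloaded to VRAM. The GGUF file name and layer count below are illustrative, not taken from the linked repo.

```python
# Minimal sketch: run a quantized Mixtral GGUF with partial GPU offload.
# Assumes `pip install llama-cpp-python` built with GPU support (CUDA/Metal/ROCm)
# and that the GGUF file below has already been downloaded (path is illustrative).
from llama_cpp import Llama

llm = Llama(
    model_path="nous-hermes-2-mixtral-8x7b.Q4_K_M.gguf",  # a ~29GB 4-bit quant
    n_ctx=4096,        # context window; larger contexts need more RAM/VRAM
    n_gpu_layers=12,   # how many layers to offload to VRAM; 0 = CPU only
)

out = llm("Explain mixture-of-experts in two sentences.", max_tokens=128)
print(out["choices"][0]["text"])
```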
- viraptor 1 year agoAnd if you want better performance when talking about code, you can try the dolphin-mixtral fine-tune https://huggingface.co/TheBloke/dolphin-2.7-mixtral-8x7b-GGU...
- tarruda 1 year ago> You would need multiple GPUs with shared memory if you wanted to offload the higher precision models to VRAM.
Or just a powerful Apple Silicon machine? I've tried dolphin-mixtral 4bit on a 36GB RAM MacBook M3, and inference is super fast.
- mkesper 1 year agoOr a Linux machine with a Ryzen using the internal GPU and the unified RAM (scroll down at llama.cpp and look for ROCm).
- jacooper 1 year agoWait, ROCm supports Ryzen APUs and still doesn't support dedicated GPUs like the 6700XT?!
- KeplerBoy 1 year agoOr a Jetson AGX Orin (~$2k). Probably the cheapest way to get an Nvidia GPU with 64 GB of RAM.
- stavros 1 year agoWhat? I can do this? Runs to the PC
EDIT: I cannot, I need to install ROCm to compile with it, and then install something called hipBLAS, and who knows what else.
- dimask 1 year agoHow much RAM are you able to set aside for a ryzen igpu?
- ode 1 year agoHow fast is it with a setup like this?
- pennaMan 1 year agoI can run 4bit on a beat-up 1070 Ti. GP is talking about higher-precision models.
- sp332 1 year agoYou wouldn’t be able to fit the whole model into 8GB VRAM. It’s faster than not using a GPU at all, but most of it would still be computed on the CPU.
- asimpleusecase 1 year agoDid you do anything special to make that work? Is it useful? Or just a toy?
- windexh8er 1 year agoI have a 14" MBP with an M1 Max and 64GB. The M3 won't really make a difference, but the RAM, since unified, is huge. I can run most models on this machine with realtime performance compared to a Ryzen 7735HS and 64GB (DDR5). Now I'm not saying the Ryzen setup should be good, but the M1 architecture just makes it a much better option. I could add an eGPU to the Ryzen system and it could likely do better, but would also exceed the price point and portability.
- brucethemoose2 1 year agoI've had less luck with Mixtral, but I run Yi 34B finetunes for general personal use, including quick queries for work.
It's kinda like GPT 3.5, with no internet access and slightly less reliable responses, but unrestrained, much faster and with a huge (up to 75K on my Nvidia 3090) usable context.
Mixtral is extremely fast though, at least at a batch size of 1.
- tarruda 1 year agoNothing special other than llama.cpp, which is an inference engine optimized for apple silicon.
I heard you can simply install the ollama app, which uses llama.cpp under the hood but has a more user-friendly experience.
- EarthLaunch 1 year agoI've been using it for 'easy' queries like syntax/parameter questions, in place of ChatGPT 4. It's great for that. I am using a ~48GB version.
- bearjaws 1 year ago2bit is pretty damn terrible, I don't recommend it for anything serious.
- ignoramous 1 year agoAt that level of quantization / distillation, smaller models like phi-2 (q&a) and wavecoder-6.7b (code-gen) might be preferable over QLoRAd ones: https://huggingface.co/microsoft/phi-2
> 2bit is pretty damn terrible
Wait till you go hybrid [0] or even 1bit [1]
- jodleif 1 year agoI prefer koboldcpp over llama.cpp. It's easy to split between GPU/CPU on models larger than VRAM.
- sp332 1 year agoLlama.cpp has --n-gpu-layers that lets you set how much of the model to put on the GPU.
- DrSiemer 1 year agoRuns in the Oobabooga text UI as well, if you add the llama.cpp extension. Easier interface IMO, plus fun stuff like Coqui and Whisper integration.
- rhdunn 1 year agoThat's interesting. It also looks like koboldcpp works better with long interactions, as it only processes changed tokens. I'm using llama.cpp with text-generation-webui and its OpenAI compatible API. I'll have to look to see if I can use koboldcpp with it.
- sp332 1 year agoLlama.cpp has an interactive mode, but I don't think text-generation-webui uses it. https://github.com/ggerganov/llama.cpp/blob/master/examples/...
- jodleif 1 year agoIndeed. Koboldcpp works fine with other UIs than the bundled one.
- magicalhippo 1 year agoI've got an aging 2080Ti and Ryzen 3800X with 96GB RAM, any point in trying to mess with the GPU or?
Haven't really been able to justify upgrading to a 4090 or similar given I play so few new games these days.
- htsh 1 year agoYes, offloading some layers to the GPU and VRAM should still help. And 11gb isn't bad.
If you're on linux or wsl2, I would run oobabooga with --verbose. Load a GGUF, start with a small number of GPU layers and creep up, keeping an eye on VRAM usage.
If you're on windows, you can try out LM Studio and fiddle with layers while you monitor VRAM usage, though windows may be doing some weird stuff sharing ram.
Would be curious to see the diffs, specifically whether there's a complexity tax in offloading that makes CPU-alone faster. But in my experience with a 3060 and a mobile 3080, offloading what I can makes a big difference.
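As a starting point for that creep-up, a back-of-the-envelope sketch (assuming layers are roughly equal in size, which isn't exact, and reserving a couple of GB of headroom for the KV cache and scratch buffers):

```python
# Back-of-the-envelope: how many layers of a GGUF might fit in free VRAM.
# Numbers are illustrative; creep up from the estimate while watching nvidia-smi.
def layers_that_fit(model_size_gb: float, n_layers: int,
                    free_vram_gb: float, headroom_gb: float = 2.0) -> int:
    """Assume layers are roughly equal in size and keep headroom for the
    KV cache and scratch buffers."""
    per_layer_gb = model_size_gb / n_layers
    usable = max(free_vram_gb - headroom_gb, 0.0)
    return min(n_layers, int(usable / per_layer_gb))

# Example: a ~26GB Q4 Mixtral, 32 transformer blocks, an 11GB card (2080 Ti)
print(layers_that_fit(26, 32, 11))  # -> around 11 layers as a starting guess
```

Treat the result as a floor rather than a target, since real per-layer sizes and runtime buffers vary.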
- baq 1 year agoI had only CUDA installed, and it took 2 ollama shell commands in WSL2 to go from quite literally zero local LLM experience to running Mixtral fast enough on a 1070 and 12700K. Go for it.
- attentive 1 year agoKobold bundles and runs llama.cpp, so it should be fairly similar, with convenient defaults.
- DreamGen 1 year agoWhen talking about memory requirements one also needs to mention the sequence length. In case of Mixtral, which supports 32000 tokens, this can be a significant chunk of the memory used.
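To put a rough number on that: an fp16 KV-cache estimate for Mixtral at the full 32k context, using its published architecture figures (32 layers, 8 KV heads of dimension 128 with grouped-query attention):

```python
# Rough KV-cache size estimate for Mixtral 8x7B at full 32k context (fp16 cache).
n_layers, n_kv_heads, head_dim = 32, 8, 128   # Mixtral 8x7B architecture
seq_len, bytes_per_val = 32_000, 2            # 32k tokens, fp16 values
kv_bytes = 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_val  # K and V
print(f"{kv_bytes / 1e9:.1f} GB")             # ~4.2 GB on top of the weights themselves
```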
- chpatrick 1 year ago`ollama run mixtral:8x7b-instruct-v0.1-q3_K_L` works fast on my 3090 locally
- beck5 1 year agoDumb question, but how can a 32 bit number be converted to 2 bits and still be useful? It seems like magic.
- regularfry 1 year agoMixtral and others are often distributed as 16-bit floats, so that chops the problem in half immediately, but then it turns out that LLMs only have about four bits per parameter of actual information stored. There's a lot of redundancy. The ideal quantisation scheme would only throw away useless data, but no quantisation scheme is perfect so they inevitably harm the model somehow.
You've then got to remember that one thing neural networks are very, very good at is being noise tolerant. In some senses that's all they are - noise correction systems. The inaccuracies introduced by quantisation are "just" a sort of noise, so it's not surprising that they aren't fatal. It just raises the noise floor and gives the model more ways to be wrong.
Finally the thing to know is that these quantisation schemes don't do a naive "chop each number down to two bits", not exactly. Simplifying a bit, for each parameter in this example they'd try to find a mapping from a two-bit index into a four element lookup table of higher-precision values such that the information destroyed by replacing the original parameter by the lookup value is minimised. That mapping is calculated across small blocks of parameters, rather than across the entire model, so it can preserve local detail. The lookup table gets stored per block, which throws the compression ratio off a little.
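A toy illustration of that block-wise idea follows. This is not llama.cpp's actual k-quant format, just the general shape: small integer codes per weight plus per-block reconstruction values.

```python
import numpy as np

# Toy block-wise 2-bit quantization: per block, store a scale and offset plus
# a 2-bit code per weight. Real schemes (e.g. llama.cpp k-quants) are fancier.
def quantize_block(w: np.ndarray, bits: int = 2):
    levels = 2 ** bits
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / (levels - 1) or 1.0
    codes = np.round((w - lo) / scale).astype(np.uint8)  # values in [0, levels)
    return codes, scale, lo

def dequantize_block(codes, scale, lo):
    return codes * scale + lo

rng = np.random.default_rng(0)
w = rng.normal(size=64).astype(np.float32)        # one 64-weight block
codes, scale, lo = quantize_block(w)
w_hat = dequantize_block(codes, scale, lo)
print("max abs error:", np.abs(w - w_hat).max())  # the "noise" added by 2-bit codes
```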
- DougBTX 1 year agoNice graphs here: https://github.com/ggerganov/llama.cpp/pull/1684
So for example, the 2-bit version of the 30B model is much worse than the original, but still better than the 13B model.
Also, there are lots of extra details, e.g., not all of the weights are 2-bit, and even the 2-bit weights effectively use more than 2 bits overall, as groups of quantised weights share scale factors stored elsewhere.
- beefield 1 year agoI think of it with this kind of analogy: the original image is stored with a 32 bit color scheme. You can reduce the color scheme to 16 bit accuracy and still figure out pretty well what the image is about. 2 bit is stretching this a bit far (each pixel is down to one of just four values), but even if you lose lots of the nuances in the image, in many cases that still gives you some idea of what's going on.
- DougBTX 1 year agoThat’s an interesting question, I wonder if there is an analogy in quantisation to image dithering?
- hnfong 1 year agoThis blog post might shed some light on the matter. If I'm understanding it correctly, it claims there are emergent features in the LLM weights that make it easier to "compress" the floats into fewer bits without losing much precision.
https://timdettmers.com/2022/08/17/llm-int8-and-emergent-fea...
Note that 2 bit quantization is generally regarded as too aggressive. Generally 4bits+ achieves a good tradeoff, see eg. https://arxiv.org/abs/2212.09720
- brucethemoose2 1 year agoIt's not really 2 bits.
Modern quantization schemes are almost like lossy compression algorithms, and llms in particular are very "sparse" and amenable to compression.
- ttoinou 1 year agoAll 32 bits weren't necessarily used, and it's the whole network that has to be useful. It's a tradeoff: we started with very good precision to test the new method; now we can optimize some parts of it.
- Const-me 1 year agoHere’s an example of a custom 4 bits/weight codec for ML weights:
https://github.com/Const-me/Cgml/blob/master/Readme.md#bcml1...
llama.cpp does it slightly differently but still, AFAIK their quantized data formats are conceptually similar to my codec.
- sp332 1 year agoThe extra precision is more useful for training. Once the network is optimized, it's a statistical model and only needs enough precision to make good guesses. In fact, one of the big papers on this also pointed out that you can drop about 40% of the weights completely. I think people generally skip that part because sparse matrix operations are slower, so it doesn’t help here.
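The pruning idea mentioned there, in its simplest magnitude-based form, looks roughly like this (the papers' actual methods are more sophisticated; this is just the flavor):

```python
import numpy as np

# Toy magnitude pruning: zero out the smallest 40% of weights by absolute value.
rng = np.random.default_rng(0)
w = rng.normal(size=(512, 512)).astype(np.float32)
threshold = np.quantile(np.abs(w), 0.40)
pruned = np.where(np.abs(w) >= threshold, w, 0.0)
print(f"sparsity: {(pruned == 0).mean():.0%}")  # ~40% of entries removed
```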
- viraptor 1 year agoFor models with dropped weights, the keyword is "distilled". For example ssd-1b is a 50% size version of Stable Diffusion XL (https://huggingface.co/segmind/SSD-1B)
- EVa5I7bHFq9mnYK 1 year agoFaraday.dev has it in its selection of models now. Good for us clueless Windows folks. Runs decently fast with 16gb mobile 3080 gpu. Results seem better than any other free option.
- thriw63748 1 year agoWhy not normal RAM? A Ryzen 5600 with 128GB DDR4 is perfectly fine for running Mixtral 8bit, and costs less than $1000.
GPUs are only needed if you cannot wait 5 minutes for an answer, or for training.
- snowfield 1 year agoOr if you want multiple sessions at the same time. Or if you want to do anything else with your machine while it's running.
But realistically, 5 minutes is too long. It should be conversational, and for that you need at least 5 tokens per second. Which your Ryzen just can't do.
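For a rough sense of where that limit comes from: token generation is largely memory-bandwidth bound, since every active parameter is read once per token. A sketch, assuming Mixtral activates roughly 13B of its ~47B parameters per token and dual-channel DDR4 delivers about 50 GB/s:

```python
# Bandwidth-bound upper estimate on tokens/sec for CPU-only inference (very rough).
active_params = 13e9      # Mixtral routes ~13B of its ~47B params per token
bytes_per_param = 1.0     # 8-bit quantization
ddr4_bandwidth = 50e9     # dual-channel DDR4-3200, bytes/sec (approximate)

bytes_per_token = active_params * bytes_per_param
print(ddr4_bandwidth / bytes_per_token)  # ~3.8 tok/s best case; real-world is lower
```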
- MPSimmons 1 year ago>It should be conversational, and for that you need at least 5 tokens per second.
To be fair, a lot of people are using this for non-interactive work, like batching document analysis or offline processing of user generated content.
- irusensei 1 year agoWhy not both? llama.cpp allows splitting GGUF model layers between GPU and CPU memory.
- dragonwriter 1 year ago> GPUs are only needed if you can not wait 5 minutes for an answer
Yeah, but that's generally true (or at least, “5 minutes for an answer is very suboptimal”, even if “can’t” isn’t quite true) for interactive use cases, which are... a lot of LLM use cases.
- juliangoldsmith 1 year agoNot sure why you're getting downvoted. It performs decent enough on my Ryzen 3600X with 64GB of RAM. It definitely wouldn't be usable for production or fine-tuning, but it's fine for experimenting.
- brucethemoose2 1 year ago> perfectly fine
Only for very short context and responses.
Beyond that, the performance is painful.
- rhdunn 1 year agoThat was what I was referring to with the 32/64 GB systems.
- SkyMarshal 1 year agoWhat's the bandwidth between the Ryzen and that DDR4?
- MuffinFlavored 1 year agoWhat differences would I measurably notice running the 2-bit version vs the 4-bit version vs the 6-bit vs the 8-bit?
- zaat 1 year agoUsing TheBloke's nous-hermes-2-mixtral-8x7b-dpo.Q5_K_M.gguf, llama.cpp with oobabooga, on a Ryzen 9 7950X and RTX 2070 Super, offloading 4 layers to the GPU.
27,631MiB system memory used.
7358/8192MiB usage on the GPU.
I'm getting 6.32 tokens/second.
You: You are Hunter S. Thompson, write a 450 words essay on the LHC project. The writing must be in Thompson's style
AI: Well, buckle up, kiddos, because we're about to dive into the twisted world of particle physics and the Large Hadron Collider, or as I like to call it, the Doomsday Machine. This beast of a thing is buried deep beneath the Swiss countryside, a 17-mile-long ring of death and destruction that makes the atom bomb look like a goddamn firecracker.
Now, let me tell you, when the Swiss government first proposed this monstrosity, I thought they'd gone completely batshit crazy. But then again, this is the land of Toblerone and Heidi, so maybe I shouldn't be too surprised. The LHC is essentially a giant, underground racetrack for subatomic particles, where they're accelerated to nearly the speed of light before being smashed into each other like a couple of drunken frat boys at a kegger.
The goal of this madness? To recreate the conditions that existed just after the Big Bang, when the universe was nothing more than a seething cauldron of energy and chaos. And if that doesn't sound like a recipe for disaster, I don't know what does.
But let's not forget the real reason we're all here: the Higgs boson, or as it's more commonly known, the God Particle. This little bugger is what gives other particles their mass, and without it, we'd all be floating around like a bunch of cosmic tumbleweeds. The LHC was designed to find this elusive particle, and in 2012, they finally did it. Sort of.
You see, the Higgs boson they found was a bit of a disappointment. It was too light, too wimpy, and didn't behave quite the way they expected it to. So now, the physicists are back at the drawing board, trying to figure out what the hell went wrong. And in the meantime, the LHC continues to churn out more and more data, like some kind of unholy particle-smashing machine.
But let's not forget the dark side of the LHC. There are those who believe that this thing could very well be the end of the world as we know it. They fear that the LHC could create a black hole, or even a miniature Big Bang, right here on Earth. And while the scientists assure us that these fears are unfounded, I can't help but wonder if they're just whistling past the graveyard.
So, what have we learned from all this? Well, for one thing, we've discovered that the universe is a much stranger place than we ever imagined. We've also learned that even the most brilliant minds can be blinded by their own ambition, and that sometimes, the pursuit of knowledge can lead us down some very dark paths.
In the end, the LHC is a testament to the insatiable curiosity of the human race, and our unquenchable thirst for knowledge. It's a monument to the power of science, and the potential consequences of playing God. And as we continue to push the boundaries of what's possible, we must always remember that with great power comes great responsibility. Or, as I like to say, "When the going gets weird, the weird turn pro."
- tacoman 1 year agoI am using the exact same model. Ryzen 5600G w/32GB and an Nvidia P40 w/24GB VRAM
20/33 layers offloaded to GPU, 4K context. Uses 25GB system RAM and all 24GB VRAM. 5-7 tokens per second.
- zaat 1 year agoContext is set to 32768; I didn't change it, so I guess that's the model's default.
Thanks for making me feel better about investing in that motherboard + CPU + RAM upgrade and deferring the GPU upgrade.
- attentive 1 year agoand Groq does 485.08 T/s on mixtral 8x7B-32k
I am not sure local models have any future other than POC/research. Depends on the cost of course.
- wolverine876 1 year agoKudos to Brave (for this and other privacy features):
Unlinkable subscription: If you sign up for Leo Premium, you’re issued unlinkable tokens that validate your subscription when using Leo. This means that Brave can never connect your purchase details with your usage of the product, an extra step that ensures your activity is private to you and only you. The email you used to create your account is unlinkable to your day-to-day use of Leo, making this a uniquely private credentialing experience.
- quinncom 1 year agoThis is very cool, and something I'd like to integrate in my own apps. Does anybody know how exactly this works without linking records via foreign keys?
- luke-stanley 1 year agoI could guess, an "anonymous payment credential service" could do something like this:
1. User completes payment for the paid-for service.
2. To track the payment entitlement, a random, unique ID is generated by the service for the user, one that is not related to any of their data.
3. This ID is saved in a database as a valid payment key.
4. The database records IDs in shuffled batches, or with semi-random fuzzy / low-resolution timestamps, to prevent correlation between payment time and ID generation.
5. Each ID has an entitlement limit or usage stopping point, ensuring it's only valid for the subscribed period.
Another way might be Zero-Knowledge Proofs (ZKPs), but that might be more complex. They might even use their BAT crypto stuff for this somehow, I suppose. Whatever the solution, it would need a fundamental way to avoid correlation, I think.
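A minimal sketch of the first scheme described above (random tokens whose hashes are stored with only coarse expiry information). Real systems such as Privacy Pass go further, using blind signatures so the issuer can't even link issuance to redemption of a specific token, which this toy version does not capture:

```python
import hashlib
import secrets
import time

# Toy unlinkable-credential flow: on payment, the service issues a random token
# and stores only its hash plus a coarse (day-level) expiry, so later redemptions
# can't be correlated with purchase details or the exact payment time.
VALID: dict[str, int] = {}   # token hash -> expiry day

def issue_credential(valid_days: int = 30) -> str:
    token = secrets.token_urlsafe(32)                     # random, unrelated to user data
    expiry_day = int(time.time()) // 86_400 + valid_days  # day-level resolution only
    VALID[hashlib.sha256(token.encode()).hexdigest()] = expiry_day
    return token                                          # handed to the client

def redeem(token: str) -> bool:
    expiry = VALID.get(hashlib.sha256(token.encode()).hexdigest())
    return expiry is not None and int(time.time()) // 86_400 <= expiry

t = issue_credential()
print(redeem(t))   # True: subscription is valid, but not linkable to the purchase
```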
- syntaxing 1 year agoInteresting, I must have missed the first Leo announcement. I really like how privacy-conscious it is. They don't store any chat records, which is what I want.
- Dwedit 1 year agoThere is no way to confirm that claim, just like there is no way to confirm that a VPN service is "no log".
- lolinder 1 year agoYes, at some point if you're going to interface with other humans you will eventually just have to trust their word.
For some people's threat models that isn't good enough, but for the vast majority of people—people who aren't being pursued by state intelligence agencies but who are squeamish about how much data a company like Google collects—a pinky promise from Brave or Mullvad is good enough.
- wolverine876 1 year ago> For some people's threat models that isn't good enough, but for the vast majority of people—people who aren't being pursued by state intelligence agencies but who are squeamish about how much data a company like Google collects—a pinky promise from Brave or Mullvad is good enough.
Who are you to say it's good enough (and ridicule people who disagree)? We don't have much evidence of that, because people have very few options, and of course most are not informed and lack the expertise to understand the issues (a good situation for regulation). At one point lots of people used lead paint and would have told us they were fine with it.
> Yes, at some point if you're going to interface with other humans you will eventually just have to trust their word.
There's technology, such as the authorization tokens used by Brave, that reduces that risk. Of course, no risk can be completely eliminated, but that doesn't mean we shouldn't reduce it.
- bcye 1 year agoI would like to think GDPR ensures this pinky promise is good enough
- Erratic6576 1 year agoYou gotta take them at their word
- Tanath 1 year agoIt's foolish to trust Brave. https://www.spacebar.news/p/stop-using-brave-browser
- firtoz 1 year agoWhat are good API providers that serve Mixtral? I only know OctoAI, which seems decent, but it would be good to know alternatives too.
- jeanloolz 1 year agoThe creators of the model actually have their own platform where you can access this model and others via API: https://console.mistral.ai/
- attentive 1 year agoI just discovered Groq, which does 485.08 T/s on mixtral 8x7B-32k
No idea on pricing, but supposedly one can email api@groq.com
- frozenport 1 year agoI think you can try it online at chat.groq.com
- tome 1 year ago(Groqster here). Yes, you can select Mixtral from the dropdown menu. If anyone has any questions about Groq let me know and I'll do my best to answer!
- Tiberium 1 year agoOpenRouter is generally a good option (already mentioned), the best part is that you have a unified API for all LLMs, and the pricing is the same as with the providers themselves. Although for OpenAI/Anthropic models they were forced (by the respective companies) to enable filtering for inputs/outputs.
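Since OpenRouter (like several of the hosts mentioned here) exposes an OpenAI-compatible endpoint, switching providers is mostly a matter of changing the base URL and model slug. A sketch using the openai Python client; the environment variable name is an assumption and the model slug should be checked against the provider's listing:

```python
import os
from openai import OpenAI

# OpenRouter (and several other hosts) speak the OpenAI chat-completions API,
# so the official client works by pointing base_url at their endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],   # assumed env var name
)

resp = client.chat.completions.create(
    model="mistralai/mixtral-8x7b-instruct",    # slug as listed by the provider
    messages=[{"role": "user", "content": "Summarize what a mixture-of-experts model is."}],
    max_tokens=200,
)
print(resp.choices[0].message.content)
```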
- petercooper 1 year agoBoth already mentioned, but I am using Anyscale Endpoints with great success, very fast and will work on ten jobs at a go out of the box. Together.ai also seems to work fine in my initial tests, but haven't tried it at scale yet.
- mark_l_watson 1 year agoI have used both Mistral's and AnyScale's commercial APIs for Mixtral 8x7B; both providers are easy to use.
I also run a 3-bit quantization of Mixtral 8x7B on my M2 Pro 32GB system and it is fairly quick.
It is great having multiple options.
- htsh 1 year agoOpenRouter, Fireworks, Together.
We use OpenRouter but have had some inconsistency with speed. I hear Fireworks is faster; swapping it out soon.
- tome 1 year agoI work for Groq and we serve the fastest available version of Mixtral (by far) and we also have a web chat app. I'll refrain from linking it because it has already been linked and I don't want to spam, but I'm available to answer any questions people have about Groq's hardware and service.
- bearjaws 1 year agoTogether.ai seems to be the best, incredibly fast.
- T_S_ 1 year agoNot so sure about that. Check out https://github.com/ray-project/llmperf-leaderboard
And try mixtral on chat.groq.com
- RupertEisenhart 1 year agoThese guys are much faster than openrouter, and their llama2 runs faster than 3.5-turbo. Amazing work.
- firejake308 1 year agoI personally like Anyscale Endpoints
- Philpax 1 year agoI've had good experiences with Together, and they have very competitive pricing.
- jug 1 year agoYou also have Replicate: https://replicate.com/mistralai/mixtral-8x7b-instruct-v0.1
- frozenport 1 year agoI've been running the version on poe and chat.groq.com for the last week.
Much better than llama 70b.
- charcircuit 1 year agoIt's interesting that they made it so you can ask LLM queries right from the omnibar. I wonder if they'll eventually come up with some heuristic to determine whether the query should be sent directly to an LLM or to the default search provider.
- m3kw9 1 year agoIf you have used GPT-4 and then use Mistral, it's like looking at a Retina display and then having to go back to a low-res screen. You are always thinking "but GPT-4 could do this though".
- mpalmer 1 year agoHave you used mixtral?
- kristianpaul 1 year agoI run Mixtral locally using ollama
- emmanueloga_ 1 year agoDoes anyone know of a good chrome extension for AI page summarization? I tried a bunch of the top Google search hits, they work fine but are really bloated with superfluous features.
- Terretta 1 year agoSee Kagi's Universal Summarizer
https://kagi.com/summarizer/index.html
https://help.kagi.com/kagi/api/summarizer.html
"Alternatively use Kagi Search browser extension (Chrome/Firefox) and you can use the most advanced Muriel model right from the extension."
- andai 1 year agoAsked Mistral 8x7B for an essay on ham. It started telling me about Hamlet.
- Erratic6576 1 year agoIt must start from the beginning. Pig > piglet. Ham > Hamlet
- andai 1 year agoWould make sense if it were the first token. But it's the last, presumably with an "end of user message" separator! (Or perhaps not? I don't know.)
- fifteen1506 1 year agoJust checking: PDF summarization is not yet implemented, right?
- _aaed 1 year agoThe Kagi browser extension can do that, if you're a subscriber
- fifteen1506 1 year agoAsk a PDF? I thought it was only the $25 a month plan.
- _aaed 1 year agoNo, it's just text, like so: https://i.imgur.com/3NMzyDf.png
- finikytou 1 year agoQuick question: I have 24GB VRAM and I need to close everything to run Mixtral at 4-bit quant with bitsandbytes. Is there no way to run it at 3.5 bits on Windows?
- davikr 1 year agoIt's nice using Brave because you have Chromium's better performance, without having to worry about Manifest V2 dying and taking adblocking down with it. I have uBlock Origin enabled, but it has barely caught anything that slipped past the browser filters.
- croes 1 year agoBrave's continued support of Manifest V2 is totally dependent on Google and Chrome
>Brave will support uBO and uMatrix so long as Google doesn’t remove underlying V2 code paths (which seem to be needed for Chrome for enterprise support, so should stay in the Chromium open source)
- steve_rambo 1 year agoIt doesn't really matter much in practice. The built-in adblocker (which does not rely on extension APIs) has excellent performance, about as effective as ublock origin, and supports the same filter lists.
- davikr 1 year agoYeah, but the Brave adblocker is built-in, it's not an extension.
- croes 1 year agoThere are more extensions than just ad blockers
- Diti 1 year agoIf by performance you mean browser performance, Firefox actually has the edge nowadays. https://news.ycombinator.com/item?id=36770883
- soundnote 1 year agoOne genuine performance edge Firefox has is that it can handle giant YouTube comment sections better than Chromium browsers, in part because it accepts a larger maximum page size.
- FoodWThrow 1 year agoI use Firefox because it has the most hassle-free hardware decoding on Linux. However, everything basically feels better with Brave, even with the same number of plug-ins.
- kristianp 1 year agoI found the snap update notifications too annoying on Ubuntu, so I tried the PPA. But then the video plugin would crash. So back to Chrome for me.
- jacooper 1 year agoWhile still not being as secure as Chromium, and still not supporting many advanced features like WebGPU.
- danShumway 1 year agoThis has come up a few times, but as far as I know Gorhill is planning to full-on drop support for Chromium browsers outside of uBO Lite. Does Brave keeping Manifest V2 matter if developers stop maintaining the Chromium version of those extensions?
I also still can't really find if Brave has an extension store or if it's connecting to Chrome's. If it's the latter, then it seems like V2 extensions are going away for Brave regardless of what API decisions it makes, because short of sideloading them the addons won't be on the Chrome web store anymore and won't be getting updates. I assume Brave supports sideloading extensions, and maybe developers would maintain extensions for Chromium that can't actually be used in Chrome? Although that seems a little optimistic. But not having a way to search for V2 extensions or get ratings/reviews seems like it will still be a problem for users.
- Departed7405 1 year agoBrave indeed uses the Chrome Web Store.
- srid 1 year ago> I have uBlock Origin enabled, but it has barely caught anything that slipped past the browser filters.
I have been completely satisfied with Brave's builtin ad-blocker. Does uBlock Origin catch anything in particular that the builtin one does not?
- rozab 1 year agoI couldn't see Brave being able to counter, say, the recent YouTube anti-adblock push as quickly as the open source community did. I could see that kind of stunt becoming more common as Google tries to nail the coffin shut and deny their competitors this USP
- SparkyMcUnicorn 1 year agoBrave's ad blocking is compatible with (and uses) the uBlock Origin lists, so you shouldn't really notice a difference between having uBO enabled or disabled in Brave.
- charcircuit 1 year agoMV3 doesn't prevent adblockers from existing.
- rpastuszak 1 year agoIt makes them almost useless in practice.
- HeatrayEnjoyer 1 year agoBecause the filter list is capped, right? Is there a reason the Brave team cannot just remove or increase the cap?
- charcircuit 1 year agoThat is a baseless statement. It doesn't make them useless as they can still block ads.
- deely3 1 year agoIs this a submarine comment?
- thebruce87m 1 year agoWhat is the definition of a submarine comment? Google fails and ChatGPT says:
> A "submarine comment" on social media refers to a comment that is made on an old post or thread, long after the conversation has died down. This term derives from the idea of a submarine which remains submerged and out of sight for long periods before suddenly surfacing. In the context of social media, it's when someone delves deep into someone else's posts or timeline, finds an old post, and leaves a comment, bringing the old post back to attention. This can sometimes surprise the original poster and other participants, as the conversation was thought to have been concluded.
Which doesn’t make sense in this context
- mcbuilder 1 year agoI think GP is trying to coin a term for stealth marketing Hacker News comments, except the analogy doesn't really make sense.
- deely3 1 year agoDefinition: https://paulgraham.com/submarine.html