Bamba: An open-source LLM that crosses a transformer with an SSM (research.ibm.com)

97 points by shallow-mind 5 hours ago | 29 comments

adt 2 hours ago [-]

Love those GPQA scores hovering around 5% when chance (on 4-way multi-choice) would have got them 25%!

gryfft 45 minutes ago [-]

A stopped clock is right twice a day, but a running clock set to the wrong time is always wrong.

mh- 4 hours ago [-]

SSM = state-space model, for the unfamiliar.

https://en.wikipedia.org/wiki/State-space_representation
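
For a concrete picture, here is a minimal sketch of a discrete-time linear state-space recurrence, x[t] = A x[t-1] + B u[t], y[t] = C x[t]. The function name `ssm_scan` and the matrices are made-up illustrative values, not anything taken from Bamba or Mamba. The only point is that the state keeps a fixed size no matter how long the input gets.

```python
def ssm_scan(A, B, C, inputs):
    """Run x[t] = A x[t-1] + B u[t], y[t] = C x[t] over a list of scalar inputs."""
    n = len(A)
    x = [0.0] * n  # fixed-size hidden state, starts at zero
    outputs = []
    for u in inputs:
        # State update: mix the previous state through A, inject the input through B.
        x = [sum(A[i][j] * x[j] for j in range(n)) + B[i] * u for i in range(n)]
        # Readout: project the state to a scalar output through C.
        outputs.append(sum(C[i] * x[i] for i in range(n)))
    return outputs, x


# Illustrative (assumed) parameters: two decaying memory channels.
A = [[0.5, 0.0], [0.0, 0.25]]
B = [1.0, 1.0]
C = [1.0, 1.0]

ys, final_state = ssm_scan(A, B, C, [1.0, 0.0, 0.0])
print(ys)           # response to a single impulse, decaying over time
print(final_state)  # still just two numbers after any number of steps
```

Real SSM layers such as Mamba make the parameters input-dependent and use a parallel scan instead of this Python loop, so treat this as the textbook recurrence only.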

jmward01 3 hours ago [-]

This type of architecture is definitely the future. Unlimited attn is a dead end. As a human you don't need to scan an entire book just to guess what the next word will be and LLMs shouldn't need that either.

quantadev 2 hours ago [-]

Not to be contrarian, but if the next word happens to be someone's name, a place, or something discussed in multiple places in the book, then yes, knowledge of the full plot is often "required" just to predict the next word, especially as you get to the middle or end of the book.

For example you could never fill in the last chapter of any good book without having knowledge of every previous chapter. Not highly detailed knowledge, but still knowledge.
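
To put rough numbers on the trade-off in this subthread, here is a back-of-the-envelope sketch of why full attention gets expensive: its key/value cache grows linearly with context length, while a recurrent state does not. The layer counts, head sizes, and state size below are hypothetical values picked for illustration, not Bamba's actual configuration.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Memory for cached keys and values: 2 tensors per layer, one row per token."""
    return 2 * n_layers * n_tokens * n_kv_heads * head_dim * bytes_per_value


def ssm_state_bytes(n_layers, state_values_per_layer, bytes_per_value=2):
    """Memory for a recurrent state: independent of how many tokens were read."""
    return n_layers * state_values_per_layer * bytes_per_value


# Hypothetical model shape (assumption, for illustration only).
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128)

for n_tokens in (1_000, 2_000, 4_000):
    print(n_tokens, kv_cache_bytes(n_tokens, **shape))

# The fixed state costs the same at any context length.
print(ssm_state_bytes(n_layers=32, state_values_per_layer=1_000_000))
```

This does not settle the argument either way: a fixed-size state has to compress everything it has read, which is exactly the "not highly detailed knowledge, but still knowledge" concern.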

mentalgear 55 minutes ago [-]

> chose to make just about everything associated with Bamba open-source — the training recipes, the data, the data loader IBM designed for largescale distributed training, and a quantization framework aimed at shaving storage and inferencing costs.

cubefox 10 minutes ago [-]

Another recent transformer/SSM hybrid is "M1", with a more than 3x claimed inference speed-up compared to equivalent transformers: https://arxiv.org/pdf/2504.10449

IBM is claiming at least a 2x inference speed-up with Bamba. Both groups say that future SSM optimizations to vLLM would lead to further inference speed improvement.

joshjob42 3 hours ago [-]

For some reason this link isn't loading, but it's on https://archive.ph/Ks0xt

aantix 2 hours ago [-]

Where's the code?

beklein 56 minutes ago [-]

I could find these two resources: Hugging Face: https://huggingface.co/collections/ibm-ai-platform/bamba-674... GitHub: https://github.com/foundation-model-stack/bamba

jwilber 2 hours ago [-]

LLM/state space models have been popular for some years now, see: https://arxiv.org/abs/2212.14052

More recently, hybrid architectures that utilize attention plus other operators are gaining traction.

See https://arxiv.org/abs/2503.01868

antirez 4 hours ago [-]

Dear IBM name pickers: "Bamba", in Italian, means cocaine.

_davide_ 37 minutes ago [-]

When I read the title 'IBM crossed a transformer with an SSM and got ‘Bamba’' I laughed so hard I woke up my kid

alex7o 3 hours ago [-]

It's just a mamba (https://github.com/state-spaces/mamba) but with a transformer. Idk where the B comes from.

rdtsc 1 hour ago [-]

So someone can get fired for picking IBM after all! Or get a bonus, depending on the organization...

iddan 3 hours ago [-]

And in Hebrew it's the name of a snack made of peanut-butter-flavored puffed maize: https://en.wikipedia.org/wiki/Bamba_(snack)

kridsdale1 2 hours ago [-]

I imported these to America to feed my infant. Data shows that the prevalence of peanut allergies lines up with when AAP guidelines started recommending that babies do NOT eat peanuts. Israel never went along with this and thus has the lowest rates of allergies in the world.

arijun 2 hours ago [-]

I think the difference in allergy rates between UK and Israeli Ashkenazi Jews (10x higher in UK Jews!) [1] is strong evidence for that.

Also, they sell Bamba at Trader Joe’s now.

[1] https://www.jacionline.org/article/S0091-6749(08)01698-9/ful...

cycomanic 52 minutes ago [-]

The latest research strongly suggests that introducing small amounts of common allergens (peanuts, shellfish, milk products...) as early as possible significantly reduces the risk of allergies later. Many early childhood organisations already recommend this. Official health recommendations are often slow to catch up (often for good reasons), but introducing peanuts etc. early is already officially recommended in quite a few countries (Australia, NZ, and Sweden, for example, AFAIK). Not all health professionals are always up to date either, though.

bonzini 2 hours ago [-]

As an Italian who has tried (only) the Israeli Bamba, I can certify that it is pretty addictive.

amitport 4 hours ago [-]

Maybe?

https://en.m.wikipedia.org/wiki/Bamba_(snack)

;)

akovaski 4 hours ago [-]

https://en.wikipedia.org/wiki/La_Bamba_(song)

dantastic 3 hours ago [-]

Or (where I'm from) a school cafeteria:

https://www.thelocal.se/20221125/swedish-word-of-the-day-bam...

ofrzeta 3 hours ago [-]

Spot on. From the linked blog post: "The refrain of La Bamba, the Mexican folk song that Ritchie Valens made famous, goes: Para bailar La Bamba/Se necesita una poca de Gracia."

vienzo 1 hour ago [-]

And in Lithuanian it's a navel

rzzzt 4 hours ago [-]

Para bailar La Bamba / Se necesita una poca de gracia

francasso 3 hours ago [-]

SSMs never stop

beanjuiceII 2 hours ago [-]

i mean that sounds good to me

samanator 4 hours ago [-]

Yummy

adt 2 hours ago [-]

https://lifearchitect.ai/models-table/
