Mar 26, 2026, 8:00 pm
VentureBeat
Processing 200,000 tokens through a large language model is expensive and slow: the longer the context, the faster costs spiral. Researchers at Tsinghua University and Z.ai have built a technique called IndexCache that cuts up to 75% of the redundant computation in sparse attention models, delivering up to…
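The blurb is truncated and does not describe how IndexCache works. As a rough illustration of the general class of optimization its name suggests, the sketch below shows sparse attention where each decoding step attends only to a top-k subset of keys, and the selected index set is cached and reused for several steps instead of being recomputed from scratch every time. All names, shapes, and the refresh schedule here are assumptions for illustration, not IndexCache itself.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sparse_attend(q, K, V, idx):
    # Attend only to the cached subset of key positions (hypothetical helper).
    scores = K[idx] @ q / np.sqrt(q.shape[0])
    return softmax(scores) @ V[idx]

rng = np.random.default_rng(0)
d, n, k, refresh = 64, 1024, 32, 4   # illustrative sizes, not from the article

K = rng.normal(size=(n, d))          # cached keys for a long context
V = rng.normal(size=(n, d))          # cached values

full_scans = 0
idx = None
for step in range(16):
    q = rng.normal(size=d)           # query for the current decoding step
    if step % refresh == 0:
        # Full scan: score every key, keep the top-k indices.
        idx = np.argsort(K @ q)[-k:]
        full_scans += 1
    # Other steps reuse the cached index set, skipping the O(n) selection.
    out = sparse_attend(q, K, V, idx)

print(full_scans)  # only 4 of 16 steps pay the full selection cost
```

Under this toy schedule, 12 of 16 steps (75%) skip the full key scan; whether the real technique caches indices this way, and what accuracy trade-off it makes, would have to come from the full article.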