The elephant-alpha mystery on OpenRouter is solved. For a few days, a model called elephant-alpha was trending on OpenRouter and no one knew what it was. It turned out to be Ling-flash-2.6 from Ant Group, and it's now live on Modular Cloud on day zero. The model: 104B total parameters with 7.4B active, and a 256K context window, designed for speed and execution across code completion, document processing, and lightweight agent workflows. OpenClaw and Hermes Agent both work with it cleanly, and it handles coding-execution subagent work well too, especially on high-frequency, short-chain tasks where inference speed is the constraint. Book a demo to get started: https://lnkd.in/eFnyMp3S
About us
The next-generation AI developer platform unifying the development and deployment of AI for the world.
- Website: https://www.modular.com
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: Everywhere
- Type: Privately Held
- Founded: 2022
- Specialties: machine learning, AI, software, TensorFlow, PyTorch, and hardware
Locations
- Primary: Everywhere, US
Updates
-
The Modular community has been cooking! 🍳 During next week's community meeting, we'll hear about three community projects:
- Marrow, an Apache Arrow implementation in Mojo
- Mojo support on Tensara, a GPU programming challenge platform
- MAV, ffmpeg bindings for Mojo
Join via Zoom: https://lnkd.in/ee8GWfMV
-
HDF5 (Hierarchical Data Format 5) is the standard file format for large scientific and numerical datasets. Particle physics simulations, climate models, ML training pipelines: if you work with scientific data at any scale, you've probably run into it.

Community member Photon recently shipped native HDF5 Mojo bindings! 🔥 The bindings use a two-layer design: a thin FFI wrapper over the HDF5 C API for full control, and a higher-level interface (HSFile, NDArray) for everyday use. Today, you can read 1D and 2D datasets without knowing their shapes ahead of time, write datasets, safely create groups, and automatically discover HDF5 libraries via $CONDA_PREFIX.

If you work with HDF5 in scientific computing or physics simulations, take a look: https://lnkd.in/eYqH4DgQ
-
We recently shipped TileTensor: Mojo's new tensor type for GPU kernel authors.

The core problem: tile-level instructions (NVIDIA TMA, AMD DME) are now performance-critical, but most tensor abstractions were designed around flat, strided arrays. TileTensor fixes that. Fully static layouts carry an 8-byte runtime footprint, which directly cuts register pressure. When we migrated our MHA kernel for AMD MI300X, we got a 5% throughput gain from the type change alone.

Our Part 1 blog post covers the design and how it compares to CuTe. Part 2 will cover the Mojo internals that made it possible. https://lnkd.in/eD9V3iy8
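The "fully static layout" idea can be sketched outside Mojo. The following is a minimal, hypothetical Python illustration (not TileTensor's actual API or implementation): when the shape and strides live in the type rather than in each instance, per-instance state shrinks to a single data reference, which is the analogue of TileTensor's 8-byte runtime footprint.

```python
# Illustrative sketch only: a "static layout" tile type where the layout
# is part of the type, not per-instance runtime state.

class StaticLayoutTile:
    # Layout fixed at "compile time": class attributes, shared by all
    # instances, so no registers/memory are spent storing them per tile.
    SHAPE = (4, 4)
    STRIDES = (4, 1)  # row-major

    __slots__ = ("data",)  # per-instance storage is just one reference

    def __init__(self, data):
        assert len(data) == self.SHAPE[0] * self.SHAPE[1]
        self.data = data

    def __getitem__(self, idx):
        # Indexing math uses the type-level strides, not instance fields.
        r, c = idx
        return self.data[r * self.STRIDES[0] + c * self.STRIDES[1]]

tile = StaticLayoutTile(list(range(16)))
print(tile[2, 3])  # -> 11
```

In a compiled language like Mojo, the same move lets the compiler fold the layout arithmetic into constants, which is where the register-pressure savings described above come from.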
-
We partnered with Proximal to run five frontier coding agents on a hard task: rebuild the full Wan 2.1 text-to-video pipeline on MAX (no PyTorch, no diffusers) in 20 hours, as part of their new Frontier-SWE benchmark. Two nearly pulled it off.

GPT-5.4 and Claude Opus 4.6 both built working pipelines from scratch: a 30-layer DiT denoiser, a 3D causal VAE, UMT5-XXL text encoding, and a flow matching scheduler, all running on MAX's graph engine.

Every model understood the architecture. What separated the successful runs was debugging discipline: the patience to inspect intermediate activations layer by layer, fix scheduler settings, track down a VAE normalization error, and keep going. Claude started at 12 dB and reached 41.1 dB by finding and fixing issues one at a time. GPT-5.4 hit 41.5 dB. The agents that topped out at 14 dB weren't confused about the task; they just stopped too early, often abandoning the actual problem to sneak in torch imports instead.

This is one of 18 tasks in Frontier-SWE, Proximal's benchmark for hard engineering problems. Full report: https://lnkd.in/epRVAV7Y
-
Most serving stacks run FLUX.2 as four separate stages with Python overhead between each one. We collapsed all four into a single fused execution graph using MLIR-based compilation.

On AMD MI355X, this means a 3.8x speedup over torch.compile, 1024x1024 images in under 3.5 seconds, and a deployment container under 700MB. We ran the same pipeline on Blackwell, too: AMD delivers equivalent generation quality at 5.5x lower cost.

Chris Lattner is presenting the full breakdown at AMD AI DevDay. Register: https://lnkd.in/ga9Yk5wt
-
AI infrastructure isn't just being built in San Francisco. On May 2nd, Mojo developers in Uyo, Nigeria, are coming together to build, learn, and connect. On the agenda: roadmap updates, a talk on where Mojo fits in the AI stack, open Q&A, and networking. Register here: https://lnkd.in/eVhxU5zA
-
Fish Audio just benchmarked SGLang, vLLM, and MAX 👀 TL;DR: MAX delivered 16% faster throughput than vLLM on L40, a p99 TTFT of 13.1ms vs 23.6ms, and containers under 700MB. It's the only stack in the comparison built without CUDA, running across NVIDIA, AMD, Apple Silicon, and CPU from one codebase. https://lnkd.in/ewipy5ZZ
-
What actually happens between submitting a prompt and getting a response? Kyle Caverly is an AI Performance Engineer on the MAX serve team. In this interview, he walks through the full request lifecycle inside MAX serve: from the moment JSON lands on the API server to the moment text streams back to the client.

Topics covered:
* Why MAX splits into two separate processes (API server and model worker)
* How the batch constructor decides what to run next
* How prefix caching and chunked prefill stack on top of each other
* Why multimodal inputs require a different approach than text at almost every stage

If you build on top of LLM APIs and want to understand what's underneath them, this is a complete guided tour. And all the code discussed is open source: https://lnkd.in/g5SQ5YEu
Inside MAX Serve: From Prompt to Response
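To make the "prefix caching and chunked prefill stack on top of each other" idea concrete, here is a hypothetical Python sketch, not MAX's actual scheduler code: the longest cached prefix of a prompt is skipped entirely, and only the uncached tail is split into fixed-size prefill chunks.

```python
# Illustrative sketch only: how prefix caching and chunked prefill compose.

def plan_prefill(prompt_tokens, cache, chunk_size=4):
    """Return the token chunks that still need prefill for this prompt."""
    # 1. Prefix caching: find the longest prefix whose KV state is cached.
    cached = 0
    for n in range(len(prompt_tokens), 0, -1):
        if tuple(prompt_tokens[:n]) in cache:
            cached = n
            break
    # 2. Chunked prefill: split the uncached tail into fixed-size chunks,
    #    so one long prompt doesn't monopolize a whole batch iteration.
    tail = prompt_tokens[cached:]
    return [tail[i:i + chunk_size] for i in range(0, len(tail), chunk_size)]

# A shared system prompt is already cached; only the user turn needs prefill.
cache = {tuple(["sys", "you", "are"]): "kv-block"}
chunks = plan_prefill(["sys", "you", "are", "a", "helpful", "bot"], cache, chunk_size=2)
print(chunks)  # -> [['a', 'helpful'], ['bot']]
```

The two optimizations are independent but multiplicative: caching shrinks the work, chunking schedules whatever work remains fairly across the batch.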
-
The Modular Community Grant Program is open! If you're building on MAX or Mojo 🔥, hosting a meetup, or speaking at a conference, there's funding for that. Grants start at $500 and scale with scope. https://lnkd.in/eSAz--uK