Roadmap
Vision
just-makeit is the fastest path from algorithm idea to production Python C extension.
Zero boilerplate. Full test coverage from day one. Just works.
The goal is simple: you should be able to think of an algorithm, run one command, and have a complete, tested, packageable C extension project, with a clean C library that can also be linked from Rust, C++, or anything else. The scaffolding should disappear. Your algorithm should be all that remains.
v0.2 — Performance scaffold ✓ shipped
Real algorithms are hot. The generated code should be ready for it.
Planned and delivered:
- --perf flag on new and init: generates jm_perf.h with JM_FORCEINLINE, JM_HOT, JM_LIKELY, JM_UNLIKELY, JM_RESTRICT, JM_ALIGNED. All macros are C99-compatible with safe no-op fallbacks.
- ENABLE_SIMD CMake option: enables -march=native -ffast-math on GCC/Clang. Off by default; opt in per build.
- /* #pragma omp simd */ annotation on the generated steps() loop.
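To make the "safe no-op fallbacks" idea concrete, here is a minimal sketch of how such macros could be written. The SK_ names and their definitions are assumptions for illustration; the generated jm_perf.h may define its JM_ macros differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the jm_perf.h macros (not the real header). */
#if defined(__GNUC__) || defined(__clang__)
#  define SK_FORCEINLINE static inline __attribute__((always_inline))
#  define SK_HOT         __attribute__((hot))
#  define SK_LIKELY(x)   __builtin_expect(!!(x), 1)
#  define SK_UNLIKELY(x) __builtin_expect(!!(x), 0)
#  define SK_RESTRICT    __restrict__
#  define SK_ALIGNED(n)  __attribute__((aligned(n)))
#else
/* No-op fallbacks: the same source still compiles as plain C99. */
#  define SK_FORCEINLINE static inline
#  define SK_HOT
#  define SK_LIKELY(x)   (x)
#  define SK_UNLIKELY(x) (x)
#  define SK_RESTRICT
#  define SK_ALIGNED(n)
#endif

/* Clipping is rare in this example, so both branches are marked unlikely. */
SK_FORCEINLINE float sk_clip(float x, float limit)
{
    if (SK_UNLIKELY(x > limit))  return limit;
    if (SK_UNLIKELY(x < -limit)) return -limit;
    return x;
}

/* Hot block loop; the restrict qualifiers promise in and out do not overlap. */
SK_HOT void sk_clip_block(const float *SK_RESTRICT in,
                          float *SK_RESTRICT out,
                          size_t n, float limit)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = sk_clip(in[i], limit);
}
```

Because every macro collapses to nothing (or to its argument) on an unknown compiler, the annotations can stay in the algorithm code without per-compiler conditionals.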
Delivered beyond plan:
- just-makeit perf command: upgrades an existing project in place without touching any user-written code. The --perf flag is for new scaffolds; just-makeit perf is for projects already in progress.
- JM_DEFINE_STEPS(fn, state_t, sample_t, LENGTH, BATCH, CHUNK) macro: stamps out the outer dispatch loop from three separated concerns: algorithm history depth, SIMD batch width, and scratch-buffer tuning. The user writes step(). The macro generates everything else.
- sliding_correlator example: proves JM_DEFINE_STEPS is algorithm-agnostic: complex cross-correlation with a different state layout and complex multiply uses the exact same macro invocation as the FIR filter.
- docs/perf.md: full reference for the macro set and JM_DEFINE_STEPS.
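The "user writes step(), the macro generates the rest" pattern can be illustrated with a deliberately simplified sketch. SK_DEFINE_STEPS below is hypothetical: it takes only the function prefix, state type and sample type, and leaves out the LENGTH, BATCH and CHUNK parameters of the real JM_DEFINE_STEPS, whose expansion is documented in docs/perf.md. The two toy algorithms are also assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for JM_DEFINE_STEPS: stamps out a
 * block-processing function fn_steps() around a user-written fn_step(). */
#define SK_DEFINE_STEPS(fn, state_t, sample_t)                      \
    void fn##_steps(state_t *state, const sample_t *in,             \
                    sample_t *out, size_t n)                        \
    {                                                               \
        assert(state != NULL);                                      \
        for (size_t i = 0; i < n; i++)                              \
            out[i] = fn##_step(state, in[i]);                       \
    }

/* Algorithm 1: stateless gain on float samples. */
typedef struct { float gain; } sk_gain_t;
static float sk_gain_step(sk_gain_t *s, float x) { return s->gain * x; }

/* Algorithm 2: exponential moving average on double samples. */
typedef struct { double alpha, y; } sk_ema_t;
static double sk_ema_step(sk_ema_t *s, double x)
{
    s->y += s->alpha * (x - s->y);
    return s->y;
}

/* Same macro, different state layouts and sample types. */
SK_DEFINE_STEPS(sk_gain, sk_gain_t, float)
SK_DEFINE_STEPS(sk_ema, sk_ema_t, double)
```

The macro never looks inside the state struct, which is why one invocation shape can serve algorithms with unrelated state layouts.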
v0.3 — Benchmarks and example test runner ✓ shipped
Delivered:
- C and Python benchmarks generated with every component (make bench, make bench-save, make bench-compare; pytest-benchmark + doppler-style C binary). Bench files are in _STATE_TEMPLATES so they regenerate on just-makeit add.
- examples/ end-to-end test runner: tests/test_examples.py auto-discovers examples/*/test.py; test_all_examples_have_test_py enforces that every example directory ships a test driver.
- examples/README.md: contributor guide explaining the .steps/ naming convention, assemble.py weaving, and the test.py contract.
- docs/examples/ retired: stale duplicate of examples/*/README.md.
v0.4 — C library distribution ✓ shipped
Today just-makeit targets Python consumers. v0.4 makes the generated project a first-class C library too — distributable to C, C++, and Rust via the standard mechanisms, with no changes to user-written algorithm code.
Core idea: each component's _core.c compiles once (as a CMake OBJECT
library) and links into both the Python DSO and a combined libmy_dsp.so.
No duplicated object files. No diverging codebases.
flowchart TD
SRC["**your C source**\ngain_core.c · bpf_core.c · …"]
SRC --> CLIB["**libmy_dsp.so**\ncombined shared library"]
SRC --> PY["**Python package**\ngain.cpython-*.so\nbpf.cpython-*.so"]
CLIB --> C["**C / C++ / Rust / …**\npkg-config · find_package"]
PY --> PYUSER["**Python**\npip install .\nfrom my_dsp import Gain"]
Delivered:
- Each component's CMakeLists.txt gains an OBJECT library target (gain_core OBJECT); the Python DSO and bench binary link against $<TARGET_OBJECTS:gain_core> instead of a static archive.
- Top-level CMakeLists.txt accumulates all OBJECT targets into libmy_dsp.so and adds install() rules for the library, headers, pkg-config file, and CMake config package.
- just-makeit init patches target_sources(${PROJECT_NAME}_lib …) in the top-level alongside the existing add_subdirectory patch.
- cmake/<project>.pc.in: pkg-config template.
- cmake/<project>-config.cmake.in: CMake find_package template.
- native/inc/<project>.h: umbrella header.
Install story:
cmake --install build --prefix /usr/local
gcc $(pkg-config --cflags --libs my-dsp) main.c -o main
find_package(my-dsp REQUIRED)
target_link_libraries(my_app PRIVATE my_dsp::my_dsp)
v0.5 — SIMD operation macros (jm_simd.h) ✓ shipped
--perf already gives you JM_DEFINE_STEPS for the outer dispatch loop.
v0.5 fills in the inner loop: a new jm_simd.h header provides
width-portable operation macros so step_batch() can be written once and
compile to AVX-512, AVX2, or scalar without any #ifdef in user code.
What you get:
/* Before v0.5: raw intrinsics, architecture-locked */
__m512 acc = _mm512_setzero_ps();
for (int k = 0; k < N_TAPS; k++)
    acc = _mm512_fmadd_ps(_mm512_set1_ps(state->coeffs[k]),
                          _mm512_loadu_ps(window + k), acc);
*out = _mm512_reduce_add_ps(acc);

/* After v0.5: portable, compiles to best available ISA */
JM_VEC_F32 acc = JM_ZERO_F32();
for (int k = 0; k < N_TAPS; k++)
    JM_MAC_F32(acc, window + k, state->coeffs[k]);
*out = JM_HSUM_F32(acc);
SIMD tiers selected at compile time:
| Tier | JM_SIMD_WIDTH_F32 | JM_VEC_F32 |
|---|---|---|
| AVX-512F | 16 | __m512 |
| AVX2 + FMA | 8 | __m256 |
| Scalar | 1 | float |
Full macro set:
| Macro | Operation |
|---|---|
| JM_VEC_F32 / JM_VEC_F64 | Width-aware vector type |
| JM_ZERO_F32() | Zero accumulator |
| JM_SPLAT_F32(x) | Broadcast scalar to all lanes |
| JM_LOAD_F32(ptr) | Unaligned load |
| JM_STORE_F32(ptr, v) | Store |
| JM_ADD_F32(a, b) | Element-wise add |
| JM_MUL_F32(a, b) | Element-wise multiply |
| JM_FMA_F32(acc, a, b) | acc += a * b |
| JM_MAC_F32(acc, ptr, s) | Load + FMA in one call |
| JM_HSUM_F32(v) | Horizontal reduce to scalar |
| jm_dot_f32(a, b, n) | Full dot product loop |
_F64 variants exist for all macros. jm_perf.h includes jm_simd.h
automatically; it can also be included standalone.
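As a rough illustration of compile-time tier selection, the sketch below picks between an AVX2 + FMA tier and a scalar tier and writes a dot product once against the chosen macros. Everything prefixed SK_ or sk_ is hypothetical and only mirrors the shape of the JM_ macros; the real jm_simd.h definitions, and its AVX-512 tier, are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__AVX2__) && defined(__FMA__)
#  include <immintrin.h>
#  define SK_SIMD_WIDTH_F32 8
typedef __m256 sk_vec_f32;
#  define SK_ZERO_F32()         _mm256_setzero_ps()
#  define SK_LOAD_F32(p)        _mm256_loadu_ps(p)
#  define SK_FMA_F32(acc, a, b) ((acc) = _mm256_fmadd_ps((a), (b), (acc)))
/* Horizontal sum: fold 8 lanes to 4, then 2, then 1. */
static inline float sk_hsum_f32(__m256 v)
{
    __m128 s = _mm_add_ps(_mm256_castps256_ps128(v),
                          _mm256_extractf128_ps(v, 1));
    s = _mm_add_ps(s, _mm_movehl_ps(s, s));
    s = _mm_add_ss(s, _mm_shuffle_ps(s, s, 1));
    return _mm_cvtss_f32(s);
}
#else
/* Scalar tier: a "vector" is a single float, so every macro is trivial. */
#  define SK_SIMD_WIDTH_F32 1
typedef float sk_vec_f32;
#  define SK_ZERO_F32()         0.0f
#  define SK_LOAD_F32(p)        (*(p))
#  define SK_FMA_F32(acc, a, b) ((acc) += (a) * (b))
static inline float sk_hsum_f32(float v) { return v; }
#endif

/* Written once; no #ifdef in the algorithm itself. */
float sk_dot_f32(const float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    sk_vec_f32 acc = SK_ZERO_F32();
    size_t i = 0;
    for (; i + SK_SIMD_WIDTH_F32 <= n; i += SK_SIMD_WIDTH_F32)
        SK_FMA_F32(acc, SK_LOAD_F32(a + i), SK_LOAD_F32(b + i));
    float sum = sk_hsum_f32(acc);
    for (; i < n; i++)          /* leftover elements when n % width != 0 */
        sum += a[i] * b[i];
    return sum;
}
```

Built with -mavx2 -mfma the loop processes eight floats per iteration; without those flags the same source falls back to the scalar tier. Floating-point summation order differs between tiers, so results can differ in the last bits for non-trivial inputs.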
Also added to jm_perf.h:
| Macro | Effect |
|---|---|
| JM_UNROLL(n) | Loop unroll directive (#pragma GCC unroll n) |
| JM_ASSUME_ALIGNED(ptr, n) | Pointer alignment assertion for auto-vectorisation |
| JM_PREFETCH(ptr, rw, loc) | Software prefetch (__builtin_prefetch) |
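A sketch of how these three helpers could be built on compiler builtins, with no-op fallbacks elsewhere. The SK_ names, the expression form of SK_ASSUME_ALIGNED, and the Clang branch are assumptions for illustration; the actual jm_perf.h definitions may differ.

```c
#include <assert.h>
#include <stddef.h>

#define SK_PRAGMA(x) _Pragma(#x)

/* Clang also defines __GNUC__, so it has to be tested first. */
#if defined(__clang__)
#  define SK_UNROLL(n) SK_PRAGMA(clang loop unroll_count(n))
#elif defined(__GNUC__) && __GNUC__ >= 8
#  define SK_UNROLL(n) SK_PRAGMA(GCC unroll n)
#else
#  define SK_UNROLL(n)
#endif

#if defined(__GNUC__) || defined(__clang__)
#  define SK_ASSUME_ALIGNED(ptr, n) __builtin_assume_aligned((ptr), (n))
#  define SK_PREFETCH(ptr, rw, loc) __builtin_prefetch((ptr), (rw), (loc))
#else
#  define SK_ASSUME_ALIGNED(ptr, n) (ptr)
#  define SK_PREFETCH(ptr, rw, loc) ((void)0)
#endif

/* Sums n floats. The caller must pass a 32-byte aligned pointer:
 * violating the alignment promise is undefined behaviour. */
float sk_sum_aligned(const float *x, size_t n)
{
    assert(x != NULL);
    const float *p = SK_ASSUME_ALIGNED(x, 32);
    float sum = 0.0f;
    SK_UNROLL(4)
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)                     /* stay inside the array */
            SK_PREFETCH(p + i + 16, 0, 3);  /* read access, high locality */
        sum += p[i];
    }
    return sum;
}
```

The rw and loc arguments must be compile-time constants, which is why they are passed as literals.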
Note on zero-dependency wheels: the v0.4 OBJECT library design already
solves this — Python DSOs embed the C code directly and have no runtime
dependency on libmy_dsp.so. Pre-compiled wheel distribution is handled by
cibuildwheel, which the end project configures for its own platform targets.
v0.6 — Type-parameterised I/O ✓ shipped
Generated step() signatures were hardcoded to float complex. Real
algorithms use whatever type fits: float, double, float _Complex,
double _Complex. v0.6 makes the I/O types explicit flags.
Delivered:
- --arg-type TYPE and --return-type TYPE on just-makeit new and just-makeit init. Supported types: float, double, float _Complex, double _Complex. Both default to float _Complex for backward compatibility.
- All generated artifacts (C header, Python binding, .pyi stub, C test, benchmarks, NumPy steps() loop) derive types from the declared flags. No manual patching needed after scaffolding.
- arg_type / return_type persisted in just-makeit.toml and read back by just-makeit add so regenerated files stay consistent.
- examples/sliding_power: demonstrates --return-type float: the estimator receives float complex samples and returns real-valued signal power.
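To show what a mixed-type step() can look like, here is a minimal sketch in the spirit of the sliding_power example: complex samples in, real power out. The type aliases, struct layout, window length and function name are all hypothetical; the code generated for examples/sliding_power may be organised differently.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical aliases standing in for --arg-type and --return-type. */
typedef float _Complex sk_arg_t;   /* input sample type  */
typedef float          sk_ret_t;   /* output sample type */

#define SK_WINDOW 4                /* assumed window length */

typedef struct {
    float  power[SK_WINDOW];  /* |x|^2 of the last SK_WINDOW samples */
    size_t head;              /* slot holding the oldest entry */
    float  sum;               /* running sum of power[] */
} sk_power_t;

/* One sample in, one average-power estimate out. */
sk_ret_t sk_power_step(sk_power_t *s, sk_arg_t x)
{
    assert(s != NULL);
    float p = crealf(x) * crealf(x) + cimagf(x) * cimagf(x);
    s->sum += p - s->power[s->head];   /* add newest, drop oldest */
    s->power[s->head] = p;
    s->head = (s->head + 1) % SK_WINDOW;
    return s->sum / SK_WINDOW;
}
```

Keeping the argument and return types behind two aliases means a change of I/O type touches one place, which is roughly what deriving every artifact from the declared flags achieves at the project level.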
v0.6.1 — Multi-component quality-of-life ✓ shipped
Fixed:
- just-makeit init now automatically splices the new component's from .comp import Comp import and __all__ entry into the existing src/<pkg>/__init__.py. Handles absent __all__, multi-line __all__, and user additions; idempotent.
- Generated pyproject.toml lists pytest-benchmark as a dependencies entry so pip install . provides everything needed to run make bench.
- JM_UNROLL comment corrected: directive (unconditionally obeyed), not an advisory hint like JM_HOT.
Added:
- examples/dsp_toolkit: two-component library (Gain + EMA) that exercises the full new → init → build → test → use workflow in CI.
- docs/workflow.md rewritten around two end-to-end scenarios.
v0.6.2 — CI split ✓ shipped
Changed:
- Post-publish smoke tests extracted into a dedicated artifact.yml workflow, triggered via workflow_run on Release success.
- release.yml now handles test → build → publish only; artifact tests run after PyPI propagation.
v0.6.3 — Realistic artifact CI ✓ shipped
Changed:
- artifact.yml rewritten around the fir_filter example: a real algorithm with array state (coeffs, delay) and a scalar param (gain). The full workflow runs in CI: scaffold → implement fir_filter_step → just-makeit perf → make && make test → just-makeit init gain with __init__.py splice check → cmake --install → pkg-config consumer → CMake find_package consumer. The C consumers assert a correct impulse response [0.25, 0.50, 0.25, 0.0, …].
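The impulse-response check can be pictured with a small stand-alone sketch. It assumes a 3-tap filter with coefficients 0.25, 0.50, 0.25, real float samples and a gain of 1.0, which is one configuration consistent with the response quoted above; the struct layout and the sk_ names are hypothetical and are not the generated fir_filter code.

```c
#include <assert.h>
#include <stddef.h>

#define SK_TAPS 3   /* assumed tap count */

typedef struct {
    float coeffs[SK_TAPS];  /* array state: filter taps */
    float delay[SK_TAPS];   /* array state: recent inputs, newest first */
    float gain;             /* scalar param applied to the output */
} sk_fir_t;

/* One input sample in, one filtered sample out. */
float sk_fir_step(sk_fir_t *s, float x)
{
    assert(s != NULL);
    /* Shift the delay line and insert the new sample at the front. */
    for (size_t k = SK_TAPS - 1; k > 0; k--)
        s->delay[k] = s->delay[k - 1];
    s->delay[0] = x;

    /* Multiply-accumulate taps against the delay line. */
    float acc = 0.0f;
    for (size_t k = 0; k < SK_TAPS; k++)
        acc += s->coeffs[k] * s->delay[k];
    return s->gain * acc;
}
```

Feeding a unit impulse (1.0 followed by zeros) through a FIR filter returns the coefficients in order and then zeros, which makes the impulse response a convenient end-to-end assertion for a consumer program.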
Ideas under consideration
These are not yet scheduled but are worth tracking:
- just-makeit ci command: just-makeit ci --provider github|woodpecker adds a CI config to an existing project, similar to how just-makeit perf upgrades the build in place. Targets both GitHub Actions and Woodpecker+Gitea workflows.
- NumPy ufunc registration: --ufunc flag wraps comp_fn as a proper NumPy generalized ufunc, enabling broadcasting and out= support.
- Windows / MSVC CI template: just-makeit new optionally generates a GitHub Actions workflow with a Windows runner.
- Interactive wizard: just-makeit new without arguments drops into a short prompt-driven setup for users who prefer a guided setup over CLI flags.