Automatic software system optimization can improve execution speed, reduce
operating costs, and save energy. Traditional approaches to optimization rely
on manual tuning and compiler heuristics, limiting their ability to generalize
across diverse codebases and system contexts. Recent methods using Large
Language Models (LLMs) offer automation to address these limitations, but often
fail to scale to the complexity of real-world software systems and
applications. We present SysLLMatic, a system that integrates LLMs with
profiling-guided feedback and system performance insights to automatically
optimize software code. We evaluate it on three benchmark suites: HumanEval_CPP
(competitive programming in C++), SciMark2 (scientific kernels in Java), and
DaCapoBench (large-scale software systems in Java). Results show that
SysLLMatic can improve system performance, including latency, throughput,
energy efficiency, memory usage, and CPU utilization. It consistently
outperforms state-of-the-art LLM baselines on microbenchmarks. On large-scale
application code, it surpasses traditional compiler optimizations, achieving
average relative improvements of 1.85x in latency and 2.24x in throughput. Our
findings demonstrate that LLMs, guided by principled systems thinking and
appropriate performance diagnostics, can serve as viable software system
optimizers. We further identify limitations of our approach and the challenges
involved in handling complex applications. This work provides a foundation for
generating optimized code across various languages, benchmarks, and program
sizes in a principled manner.