Markus Harrer

Data Science on Software Data

Make invisible technical problems visible to management. This talk shows how to use data science to build a compelling case for refactoring legacy code.

#1 about 4 minutes

The challenge of justifying legacy system improvements

Technical debt in legacy systems is difficult to communicate to management because its impact is less visible than new features or bugs.

#2 about 4 minutes

The promise and failure of universal software quality metrics

Early software analytics aimed to create universal quality dashboards but failed because metrics and models are not transferable between unique projects.

#3 about 5 minutes

Adopting analytics approaches for project-specific questions

Instead of reusing non-transferable results, teams can adapt the methodologies and tools from software analytics to answer their own unique, high-impact questions.

#4 about 5 minutes

Using data science as a foundation for software analytics

Reproducible data science provides the necessary methodologies and tools for open and automated analysis, leveraging skills developers already possess.

#5 about 6 minutes

Exploring software data types and practical analysis use cases

Analyzing static, runtime, chronological, and community data can reveal code ownership gaps, performance bottlenecks, and opportunities for modularization.
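As one illustration of the chronological (version control) case, the sketch below flags files that only a single author has ever changed, which is one possible reading of a code ownership gap. The commit records, file paths, and function names are hypothetical and not taken from the talk; in practice the pairs could be extracted from something like `git log --name-only`.

```python
from collections import Counter, defaultdict

# Hypothetical (author, changed file) pairs; invented for illustration only.
COMMITS = [
    ("alice", "billing/Invoice.java"),
    ("alice", "billing/Invoice.java"),
    ("bob", "billing/Invoice.java"),
    ("carol", "legacy/FaxGateway.java"),
    ("carol", "legacy/FaxGateway.java"),
    ("alice", "cart/Cart.java"),
    ("bob", "cart/Cart.java"),
]


def authors_per_file(commits):
    """Count how many commits each author made to each file."""
    counts = defaultdict(Counter)
    for author, path in commits:
        counts[path][author] += 1
    return counts


def single_author_files(commits):
    """Return files touched by exactly one author (possible knowledge silos)."""
    per_file = authors_per_file(commits)
    return sorted(path for path, authors in per_file.items() if len(authors) == 1)


if __name__ == "__main__":
    for path in single_author_files(COMMITS):
        print(path)
```

A list like this is only a starting point for a conversation with the team; whether a single-author file is actually a risk depends on the project.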

#6 about 13 minutes

Analyzing code coverage with Python, pandas, and Jupyter

A live coding demo shows how to use Python, pandas, and Jupyter notebooks to analyze production code coverage data and visualize unused code packages.
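The demo code itself is not reproduced on this page. A minimal sketch of this kind of analysis might look like the following, assuming a coverage export with one row per class and covered/missed line counts. The column names, class names, and numbers are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical coverage export: one row per class, measured in production.
CSV = """class,lines_covered,lines_missed
com.shop.order.OrderService,120,30
com.shop.order.OrderMapper,0,45
com.shop.legacy.ReportJob,0,200
com.shop.legacy.FaxGateway,0,80
com.shop.cart.CartService,90,10
"""

coverage = pd.read_csv(io.StringIO(CSV))

# Derive the package by cutting the class name off the fully qualified name.
coverage["package"] = coverage["class"].str.rsplit(".", n=1).str[0]

# Aggregate line counts per package and compute the share of executed lines.
per_package = coverage.groupby("package")[["lines_covered", "lines_missed"]].sum()
per_package["lines_total"] = per_package["lines_covered"] + per_package["lines_missed"]
per_package["ratio"] = per_package["lines_covered"] / per_package["lines_total"]
per_package = per_package.sort_values("ratio")

# Packages in which no line was ever executed are candidates for removal.
unused_packages = per_package[per_package["ratio"] == 0].index.tolist()

print(per_package)
print("Never executed:", unused_packages)
```

In a Jupyter notebook, `per_package["ratio"].plot.barh()` would turn the same table into a bar chart, provided matplotlib is installed.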

#7 about 3 minutes

An introduction to graph analytics for software systems

Graph analytics with tools like jQAssistant and Neo4j helps visualize and query interconnected software data like class dependencies and method calls.
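To give a feel for the kind of question a graph query answers, here is a small stand-in written in plain Python rather than against jQAssistant or Neo4j. It walks a hand-written dependency map to find everything a class depends on, directly or indirectly. The class names and the `DEPENDS_ON` structure are hypothetical; in a real setup the edges would come from scanned bytecode stored in the graph database.

```python
from collections import deque

# Hypothetical class-level dependency edges: class -> classes it depends on.
DEPENDS_ON = {
    "OrderController": ["OrderService"],
    "OrderService": ["OrderRepository", "PriceCalculator"],
    "PriceCalculator": ["TaxTable"],
    "OrderRepository": [],
    "TaxTable": [],
}


def transitive_dependencies(graph, start):
    """Return every class reachable from `start` via dependency edges."""
    seen = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited; also protects against cycles
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


if __name__ == "__main__":
    print(sorted(transitive_dependencies(DEPENDS_ON, "OrderController")))
```

A graph database expresses the same traversal declaratively and scales to whole codebases, which is the main reason to reach for it once the data no longer fits a simple in-memory map.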

#8 about 1 minute

Key principles for effective software data analysis

Successful software data analysis requires focusing on solving specific problems, working openly, automating processes, and deriving actionable next steps.

#9 about 8 minutes

Q&A on production code analysis and performance bottlenecks

The speaker answers questions about analyzing production codebases, sharing examples of identifying performance bottlenecks and justifying technology choices with data.
