Instantly generate UML diagrams from code in GitHub repositories
Diagrams automatically update when you push code using git
This page is dedicated to answering your questions.
Please send any of your feedback to email@example.com - I am most interested and will reply to all emails, and will update this Blog/FAQ page with useful information.
Have you ever tried to reverse engineer Java code into UML using various Eclipse plugins? It's a nightmare of setup and complexity, with pretty poor results. And the rising star of the programming world, Python, doesn't get much love in the UML world either. GitUML makes creating UML diagrams super easy and accessible.
There are some key ideas in GitUML that make it relevant to today's breed of programmer. Here is a summary:
I know UML is a bit 90’s but let’s not throw out stuff that, when used wisely, might help developers and teams. I’ve tried to make UML easier to access and more relevant to today’s developers.
If you are not into UML, fair enough - but some people are. The popularity of PlantUML textual UML markup is high (e.g. 1.6 million downloads for the vscode PlantUML plugin alone). GitUML simply translates Python or Java into PlantUML and then you can add your own extra PlantUML to customise, saving time.
The key is to break out of the rigid UML mindset and use your own extra PlantUML markup to customise diagrams the way you want - and have them auto update as the code changes as a result of GitHub commits. Break the rules of UML and use what works for you. There's nothing wrong with boxes and arrows - a picture can be worth a thousand words - it won't have the detail a programmer needs, but it's a way into understanding the code. View the GitUML gallery and see how powerful the diagrams are. Create your own and publish them to the community!
Sure, UML2, designed by an ivory tower committee, went off the rails and became so complex and convoluted it put everyone off UML. Then with the rise of agile methodologies, unit testing and advanced language syntax, UML became even less relevant as it could not hope to represent modern code anymore.
The answer here is not to try to use UML to fully represent code, nor to expect UML to ever be "executable", as glorious as that goal sounds (and acknowledging that some progress has been made on this frontier). UML is useful sometimes in designing code, but even then - let's not generate code from UML either - that's not how modern programmers want to work.
Instead, automatically reverse engineer code into UML as part of an ongoing workflow. The resulting class/module/package diagrams with lots of comment box annotations can't hurt a project - especially for newcomers trying to understand the big picture.
If you can onboard a new programmer more quickly by giving him/her a few key diagrams, why wouldn't you? Is anti-diagramming an ideology now?
Thus a key idea with GitUML is to honour programmers wishing to focus on working solely with code, and simply offer UML class diagrams as an automatic reverse engineering of their work. Nobody has to change their work practices and try to use UML to design code, which is no longer a popular use case. This is a pure reverse engineering tool, with minimal input from developers needed - no fancy diagram editor - just click on checkboxes next to files you want visualised.
Another key idea with GitUML is to address one of the problems with "documentation": that diagrams are always out of date. With GitUML, diagrams automatically update when you push code using git.
GitUML synchronises diagrams with git repos and keeps them up to date automatically. The default is to lock a diagram to the code in a specific commit, but you can refresh at any time to update to the latest master, or whatever branch you are working on.
I've added a lot of quotes on the GitUML homepage on the cost of onboarding programmers, and the cost of losing programmers.
I'm hoping that GitUML becomes a great tool for a company that desperately needs an experienced programmer who is leaving the organisation to rapidly document his or her big picture knowledge of a codebase. Conversely, it can help to onboard new programmers more rapidly when a few key "documentation" diagrams are available when they start. Open source projects could potentially onboard more contributors when there are UML diagrams available showing the overview of a project's architecture.
If there are any further studies on how diagrams increase the comprehensibility of source code, please let me know and I will include references to them on this site.
Python "modules" become "visual boxes with the 'M' stereotype", similar to classes which have the stereotype 'C'.
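As a rough illustration of the module box idea, the sketch below emits PlantUML text in which a module is drawn as a class-like box with an 'M' spot instead of the default 'C'. The `<< (M,#FF7700) >>` form is PlantUML's custom spot syntax. The helper name `module_box` and the colour are assumptions made for this example, not GitUML's actual output.

```python
def module_box(module_name, functions):
    """Return PlantUML text that draws a module as a box with an 'M' spot."""
    # Hypothetical helper: the name, colour and layout are illustrative only.
    lines = [f'class {module_name} << (M,#FF7700) >> {{']
    lines += [f'  {name}()' for name in functions]
    lines.append('}')
    return '\n'.join(lines)


# A module called "utils" with two top-level functions
markup = module_box('utils', ['load', 'save'])
print(markup)
```

Pasting the printed text between `@startuml` and `@enduml` in any PlantUML renderer should show a box marked 'M' that lists the two functions.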
Insider's Tip: GitUML is actually built in Python 3 and Django and is web based, so there is nothing for users to install.
Browse through community UML diagrams on GitUML.
To publish your diagram, simply click on the "publish" checkbox when editing your diagram. It will then be visible in the GitUML gallery of public diagrams.
The ability to visualise popular GitHub repos as diagrams is interesting and fun - the new diagram wizard actively seeks out popular Python and Java repositories on GitHub and lets you select them via a combo box.
The wizard lets you surf through popular and "awesome" curated GitHub repositories and visualise the architectures and patterns in them.
There is no drag and drop diagram editing in GitUML - because the vision is that we want diagrams to be created and re-created automatically, based on the actual source code residing in GitHub. We don't want to waste programmers' time creating artifacts that go out of date.
Programmers can add additional PlantUML markup which gets blended with the diagram generated from GitHub code, thus adding a lot of flexibility. Even if a diagram changes slightly over time (e.g. new methods are added to a class), the PlantUML markup will still blend in ok, in most cases. Here is an example of some additional PlantUML markup:
A -- B
This markup will add an association line between class (or module) A and class B.
Of course, if the class to which your PlantUML markup refers is no longer there in the latest commit, then you will need to revise your custom PlantUML markup - otherwise, it is simply blended into the latest diagram.
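To make the blending idea concrete, here is a minimal sketch, assuming the generated diagram is plain PlantUML text wrapped in `@startuml` / `@enduml`. The function name `blend` and its stale-line check are hypothetical and only illustrate the behaviour described above; they are not GitUML's implementation.

```python
import re


def blend(generated, custom):
    """Append custom PlantUML lines to generated markup, before @enduml.

    Returns (blended_markup, stale_lines). stale_lines holds custom
    relationship lines that mention a name not declared in `generated`.
    """
    # Names declared in the generated markup, e.g. "class A {" or "class B"
    declared = set(re.findall(r'^\s*class\s+(\w+)', generated, flags=re.MULTILINE))
    stale = []
    for line in custom.splitlines():
        # Very rough: only handles simple "X -- Y" style relationship lines
        match = re.match(r'^\s*(\w+)\s+[-.<>|*o]+\s+(\w+)', line)
        if match and not set(match.groups()) <= declared:
            stale.append(line.strip())
    body = generated.replace('@enduml', '').rstrip()
    return f'{body}\n{custom.strip()}\n@enduml', stale


generated = '@startuml\nclass A\nclass B\n@enduml'
blended, stale = blend(generated, 'A -- B')
# Class C does not exist in the generated diagram, so this line is flagged
_, stale_after_rename = blend(generated, 'A -- C')
```

In this sketch a stale line is only reported, not removed, which mirrors the point above: the custom markup is the part you revise by hand.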
GitUML only requires that you select repositories and files using point and click. Include and exclude classes, again using checkboxes. The reasoning here is to not waste time with an interactive diagram editor. Modifications to the diagram come from the source code - plus (optionally) any additional custom PlantUML markup text.
If you are a Python developer and prefer a desktop UML tool for Python, with full interactivity (drag drop, zoom etc) see my open source desktop UML tool for Python, Pynsource - reverse engineer Python into UML class diagrams.
This year's major new release of Pynsource now parses Python 3 and has zoom, layout, ASCII UML and PlantUML rendering support. Pynsource is also, as far as I know, the only UML tool that recognises Python instance attributes (not just class attributes). This means that expressions like self.myattr will result in a proper attribute “myattr” in the resulting UML class. Ready to run binaries are available for Mac, Windows, Ubuntu 18 and 16, Snap - as well as an open source GitHub repo. The Community edition is free and open source.
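For readers curious how instance attributes might be detected, here is a minimal sketch using Python's standard `ast` module. It is not Pynsource's actual parser. The function name `instance_attributes` is hypothetical, and the sketch only looks for assignments of the form `self.<name> = ...`.

```python
import ast


def instance_attributes(source):
    """Map each class name to the sorted names assigned as self.<name> inside it."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            names = set()
            # Note: nested classes are also walked, so their attributes
            # would be counted against the outer class in this sketch.
            for sub in ast.walk(node):
                # Look for assignment targets shaped like `self.something`
                if (isinstance(sub, ast.Attribute)
                        and isinstance(sub.ctx, ast.Store)
                        and isinstance(sub.value, ast.Name)
                        and sub.value.id == 'self'):
                    names.add(sub.attr)
            result[node.name] = sorted(names)
    return result


code = '''
class Person:
    species = "human"

    def __init__(self, name):
        self.name = name
        self.myattr = 0
'''
attrs = instance_attributes(code)
```

Here `species` is a class attribute and is deliberately left out, while `name` and `myattr` are picked up because they are assigned on `self`.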
You can add notes to the actual diagram itself, or use the full featured HTML editor to add comprehensive documentation under each diagram. Click on any diagram in the Gallery to see the resulting page.
Share your diagrams - you can either share the page link or share a link to just the UML diagram image itself. The latter is a permalink that will never go out of date, even if GitUML itself shuts down.
GitUML basically converts
Python or Java Code -> PlantUML markup -> UML diagram
and renders the diagram image via the public PlantUML server. GitUML is limited by the limits of that rendering service - it won't render super huge diagrams. If you run your own PlantUML server you can bypass this limitation and create extra large diagrams.
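The first step of that pipeline can be sketched in a few lines of Python. This is a minimal illustration using the standard `ast` module, not GitUML's real translator. The function name `python_to_plantuml` is hypothetical, and the sketch only handles top-level classes, their methods and simple single-name base classes.

```python
import ast


def python_to_plantuml(source):
    """Turn Python source into PlantUML class-diagram text."""
    lines = ['@startuml']
    for node in ast.parse(source).body:
        if not isinstance(node, ast.ClassDef):
            continue
        lines.append(f'class {node.name} {{')
        for item in node.body:
            if isinstance(item, ast.FunctionDef):
                lines.append(f'  {item.name}()')
        lines.append('}')
        for base in node.bases:
            # Only simple names such as `class Dog(Animal)` are handled here
            if isinstance(base, ast.Name):
                lines.append(f'{base.id} <|-- {node.name}')
    lines.append('@enduml')
    return '\n'.join(lines)


code = '''
class Animal:
    def speak(self):
        pass


class Dog(Animal):
    def fetch(self):
        pass
'''
markup = python_to_plantuml(code)
print(markup)
```

The printed markup is ordinary PlantUML text, so it could be rendered by any PlantUML server or local install.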
Arguably UML is best used judiciously to understand smaller chunks of architecture that need documenting and communicating. As you refine your diagram, you can check and uncheck the files till you get the interesting subset you want.
Thanks for all the feedback so far. Re the free plan - I'd be interested in what people think is a reasonable set of limitations and will tweak the settings. Same with the Team plan pricing - I'm expecting 99% of people will just go for the Pro plan which is so cheap ($2/month) that I was hoping to make money off bigger corporates etc. with the Team plan, which would subsidise the majority of devs on the Pro plan. Again I am open to pricing ideas, as this is just getting off the ground and really appreciate feedback at this early stage.
Remember, free users can add as much of their own additional PlantUML markup as they want - that's not limited. As there can be many Python classes per module, I'm hoping people should be able to create reasonable diagrams with free accounts.
FYI you can also upload Python and Java files and visualise them as UML if you select "New Diagram / Upload files". You cannot permanently save those diagrams, as I don't want to get into the business of storing people's source code. The upload feature is more relaxed about the number of classes/files you upload - currently I'm not enforcing any limits, though, as I say, the public plantuml.com rendering server limits the size of diagrams it will render.
I'm hoping to create a "stack overflow" type commenting system so that the best architectural diagrams get voted to the top, for each GitHub project.
A diagram showing classes, their attributes and methods - and the relationships to other classes or modules - is a handy thing to have. But for many programmers, a diagram is of limited use when it does not go into the "code level" detail they need.
That's why I'm working on a new set of UML diagramming conventions that blend actual code fragments and class + sequence diagrams together, incl. step-by-step numbered annotations to walk the programmer's eye through a diagram. I have tentatively called this notation "Hybrid UML" and will be publishing an article on Medium about it soon. If you are interested in this area of research, please contact me and I'll give you access to early drafts of my paper. You will be able to use the flexibility of PlantUML markup to create "Hybrid UML" diagrams on GitUML.
Please send any of your feedback to firstname.lastname@example.org - I would be most interested and will reply.