Development Note: This dataset includes “13,304 original images”. 95.9% which are 12,765 original images, is unfiltered and taken during a total of 7 days' trip. An additional 2.7% consists of carefully selected high-quality photos of mine, including my own drawings and paintings, and the remaining 1.4% 184 images are in the public domain. The dataset was used to train a custom-designed diffusion model (550M parameters) with a resolution of 768x768 on a single NVidia 4090 GPU for a period of 10 days of training from SCRATCH.
Talk about "Art", not just technology, and extend slightly more about the motivation.
The "Milestone" name came from the last conversation with Gary Faigin on 11/25/2024; Gary passed away 09/06/2025, just a few weeks ago. Gary is the founder of Gage Academy of Art in Seattle. In 2010, Gary contacted me for Gage Academy's first digital figure painting classes. He expressed that digital painting is a new type of art, even though it is just the beginning. Gary is not just an amazing artist himself, but also one of the greatest art educators, and is a visionary. I had a presentation to show him this particular project that trains an image model strictly only on personal images and the public domain. He suggests "Milestone" is a good name for it.
As AI increasingly blurs the lines between creation and replication, the question of originality requires a new definition. This project is an experiment in attempting to define originality, demonstrating that a model trained solely on personal works can generate images that reflect a unique artistic vision. It's a small step, but a hopeful one, towards defining a future where AI can be a tool for authentic self-expression.
Posted on Reddit, and it seems most people enjoy it, and also, there was one user who was skeptical about it who believed this model was either finetuned ot trained on more than 13K original images. I take this to mean the result is very successful and can officially call the name "Milestone".
"Milestone" is not a particular model name or a dataset's name; it is an experimental project to search for originality after 2022, when image generation became popular, which was Stable Diffusion 1.x trained on LAION 5B with 5 billion images from the entire internet.






 







