Thursdays, 1.30pm-4.20pm, Fall 2024, Fine Arts building room ACW 102
Instructor: Graham Wakefield g rrr w aaa a t yor ku do t ca
Online resources
Almost everything about this course will be linked from https://alicelab.world/digm5010 -- bookmark this URL -- or it will be in eClass
Synopsis: The purpose of Foundations is to equip and assist students to undertake graduate-level research in Digital Media.
DIGM5010 is a foundational core course in the Digital Media Graduate Program (MA/MSc/PhD), co-organized by the Department of Computational Arts in the School of the Arts, Media, Performance & Design, and the Department of Electrical Engineering and Computer Science in the Lassonde School of Engineering, York University.
"The Graduate Program in Digital Media provides highly qualified students with the opportunity to do specialised hybrid research work in a program that uniquely combines computational science and artistic practices. Work in digital media focuses on a broad range of current and emerging forms of digitally supported media, with applications that range from computer games to interactive art."
The Digital Media graduate program's academic objectives include promotion of an interdisciplinary approach to computational art-making and technology development, providing students with 21st century “real-world” skills in tandem with research acumen. To work and conduct research in this area means skillfully bridging literacies in art, science and engineering practices.
Developing these research literacies is what DIGM5010 is all about.
The goals of the Foundations course are therefore:
This means:
Establishing these foundations is ultimately evaluated through the potential to understand, transfer, and extend published research in these fields into new creative applications, recreating or mutating established research results, projects, or works to the specific interests of your research area(s) and creative domain(s).
(Reading, writing, making!)
Format and evaluation
Each weekly meeting will vary in format, but always at the heart is the discussion between all of us. This isn't a course with a static syllabus of material -- it is an adaptive living system.
The course essentially has two parallel threads.
In one thread, you will work through the stages of developing a research topic and question into a publishable research paper. This includes several milestone stages (some with course evaluation components for a total of 75% of the course grade).
In past years, some students have also been successful in submitting their final papers to real academic conferences. For example, in 2024, all 12 students completed their papers, 8 of whom submitted them to the ISEA conference (a leading conference in our field), and 4 of these were accepted and presented at the conference!
In the other thread of the course, we will examine digital media research and research-creation with a focus on practice. We will often dive into collectively coding or algorithmically reconstructing and reinterpreting work drawn from the literature, nestling theory in practice (and practice in theory). The specific topics vary from year to year and are adapted where possible to the interests of the group (and the expertise of the instructor).
You are also expected to bring work to share, making things in response to our discussions and course topics. You should keep your notes, studies and practices documented in a journal, which you will submit at the end of the course. Class participation, making and journalling make up 25% of the course grade.
Note: The grading scheme for graduate study at York is as follows: 90+ A+ (exceptional), 85+ A (excellent), 80+ A- (high), 75+ B+ (highly satisfactory), 70+ B (satisfactory), 60+ C (conditional), <60 F (fail) or I (incomplete).
Schedule
| Date | Topic | Activity | Research Paper Workflow |
|---|---|---|---|
| 9/4 | What is Research? | Call for Papers & discussion | Identify motivations, areas, research communities |
| 9/11 | What is Computation? | Computational Sketching | Focused topic selection |
| 9/18 | Reproducing Research: Yellowtail | Live Coding | Initial reading list |
| 9/25 | Introductions | 1-on-1's | Problem/thesis statement |
| 10/2 | Digital Audio and Sound Synthesis | Live Coding | Annotated bibliography (outline literature review) |
| 10/9 | Paper 1-on-1's | | Complete paper outline |
| | Reading Week | | Complete first draft |
| 10/23 | Explorable Explanations | Audio II, Data Visualization | Complete first draft |
| 10/30 | How to peer review | Peer review session | Revisions |
| 11/6 | GPU programming with GLSL | Paper 1-on-1's | Revisions |
| 11/13 | | | Final paper |
| 11/20 | | | Video/Tutorial |
| 11/27 | Final presentations | Post-mortem reflection | |
Class recording
Classes are in-person. However, I usually also open a Zoom session for my classes, and share my screen for most of the session.
I normally also record sessions and share recordings with the course participants (so long as no participants have objections to this), as I have heard that it has been useful for many students to be able to review sessions after class hours.
Synchronous attendance is highly recommended given the importance of critical discussion to the development of your studies in the program and your research/research-creation; but if you expect challenges in attending at all times, please do not worry. Having the sessions recorded alleviates some of the inevitable challenges of the graduate situation, and it also gives you time to review any material that seemed to fly by too fast. You are all coming from different backgrounds, so some of this is inevitable.
Sep 4, 2023 Class Recording
Hello and welcome!
We are in a program of digital media, co-run by computational arts and computer science. Art & science as "two cultures", meeting through a cybernetic medium of computation.
But perhaps "meeting" is too gentle a word. Sometimes it feels like a tectonic collision. To discover something new, we cannot follow a single path. Collisions between trajectories derail and open into new possibilities that wouldn't be found by only following trend lines or deductive trails.
So, be comfortable to get out of your comfort zone. Learn from each other. It's OK to be awkward. Children are the most awkward, and the fastest learners, so let us be children again from time to time -- so long as we work hard at it!
So for example, what does "computation" mean to you?
What does "research" mean to you?
Deep dive presentation & discussion
Take some time to think about this:
Through this course you will write a research paper, taking it through literature review and development, drafting in response to a call for papers, peer review, and final completion, modeled on the real processes of academic research publication. The call for papers we will use is adapted from the calls for SIGGRAPH Art Papers and ISEA (the International Symposium on Electronic Arts).
(Broad terms of call, with theme(s) and topics)
The "DIGM Computational Art/Sci Symposium" acts as a bridge between art and technology to rethink and explore our future. The dream is to bring art and engineering communities together, as has been so fruitful in history. We understand art in its broadest sense, encompassing different fields from fine art to design and architecture. Submissions exploring how computer science and interactive techniques — especially those linked to recent developments — that relate to questions of the future are particularly encouraged. We encourage submissions that discuss and explore within the fields of electronic arts, creative technology, digital culture and all manners of art-science-technology collaboration not yet born. We would like you to engage in the Renaissance of the 21st century!
Topics of interest include, but are not limited to:
(Submission categories)
Prospective authors may consider one of the following categories as they prepare their work for submission.
(Paper requirements)
We invite paper submissions for original and substantial research, both traditional and non-traditional, that has not been previously published. Papers will follow full academic practices, through an extensive literature review, argumentation, and observations or evaluations of impact. All papers must follow academic standards and will undergo blind peer review based on quality, relevance, originality, and impact.
(See notes on the Final Paper submission for details about formatting.)
Submissions will be uploaded online. We also ask you to add a very short text explaining why your submission is important for the academic and wider community, from raising critical issues to opening perspectives or generating solutions. In doing so, we aim to enhance the impact of your cultural contributions.
Authors will be required to give a 10-minute presentation of their paper.
Sep 11, 2023 Class Recording
This week's paper workflow
Here are some recent conference calls (I have also been sharing these on the DMgrad email list):
Survey responses
From last week's survey responses, I can see that some of you have advanced experience in various programming languages and creative software environments: JavaScript, Max, and Unity were mentioned most frequently, with Python, C, GLSL, TouchDesigner, Godot, and Unreal also mentioned. Others have only beginner-level experience with these.
Topics of interest mentioned (repeated mentions marked with asterisks):
This is more than we can realistically cover in one semester, especially in a course in which at least half of our time is spent on research writing, but I can certainly do some deep-dives into some of these, and I will try to weave through the topics with the technical platforms mentioned!
I also compiled a list of conferences and journals for some of these topics -- which you might use in your search for related and exemplar papers for the writing:
Goals that you mentioned include:
These are great!
A futurologist said to me: to understand the future, we must go much further into the past, to see patterns that recur and the trajectories behind the present. However, this doesn't mean entering the future through a rear-view mirror (McLuhan's warning); it is to understand the language and concepts we see the world in today, and to see their gaps and limitations.
Let's step back a moment, and understand computation from its genealogical emergence, and conceptual foundations:
What is computation? - History, theory, implementation; programs as data
It can be helpful sometimes to step into the shoes of those who have gone before, to see how we ended up here, and what we may have lost or missed along the way.
Take John Maeda, the designer & MIT Media Lab professor, who pioneered reactive graphics in the era of the Macintosh and CD-ROM.
Design By Numbers (John Maeda, 1999)
"Drawing by hand, using pencil on paper, is undisputedly the most natural means for visual expression. When moving on to the world of digital expression, however, the most natural means is not pencil and paper, but rather, computation. Today, many people strive to combine the traditional arts with the computer, and while they may succeed at producing a digitally empowered version of their art, they are not producing true digital art. True digital art embodies the core characteristics of the digital medium, which cannot be replicated in any other."Computation is intrinsically different from existing media because it is the only medium where the material and the process for shaping the material coexiist in the same entity: numbers. The only other medium where a similar phenomenon occurs is pure thought. It naturally follows that computational media could eventually present the rare opportunity to express a conceptual art that is not polluted by textual or other visual representation. This exciting future is still at least a decade or two away. For the moment, we are forced to settle with society's current search for true meaning in an enhanced, interactive version of the art that we have always known."
Maeda studied with Muriel Cooper and Paul Rand, and redefined the use of electronic media as a tool for expression by combining computer programming with traditional artistic technique, which helped lay the groundwork for the interactive motion graphics we see on the web today. (This is itself part of a longer genealogical history, tracing back to a movement of thought in the 1960s regarding how computers might augment intelligence and the nature of creativity -- with implications for AI development today.)
Other key insights from Maeda's interactive graphics explorations:
Maeda's courses and research in the Aesthetics & Computation group at MIT inspired a whole generation of creative coders. He taught Casey Reas and Ben Fry, and his Design By Numbers software was the precursor of their Processing (which led to P5.js).
Maeda's courses challenged students to rethink the medium from its most basic elements. A typical assignment:
Given a mobile point in space over a finite rectangular area, create a parametric drawing that illustrates repetition, variety, or rhythm. MAS 964 P
Golan Levin was one of Maeda's students, and went on to focus specifically on the creation of audiovisual instruments, responding directly to Maeda's project. He is now a professor at Carnegie Mellon University, and a key figure in the Art & Code community. There’s a lot to draw from his Master's thesis, both theoretically and practically.
(Also note the document structure, as an example of a thesis in our field.)
For example, look at Curly and Yellowtail -- perhaps we can try to recreate this as our first example of "reproducing research".
Let's start with a sketching application -- but let's think about how we can use computation to augment or transform our gestures in some way.
Today's code sketching progress:
Examples from 2024's class
Homework
Continue with your topic research for the paper, beginning to build your reading list.
Please add 2-4 slides to introduce yourself in The Google Slide deck here: https://docs.google.com/presentation/d/1y77v3C8q-2MuFzcvDynw2mqwjO-ng7peo9vcHvXhQtU/
Have a good read of Golan Levin's Master thesis, and note down your thoughts and questions about it for our next coding session!
How would you respond to the example challenge, "Given a mobile point in space over a finite rectangular area, create a parametric drawing that illustrates repetition, variety, or rhythm."?
Sep 18, 2023 Class Recording
As the first output in the paper writing process, you will prepare a collection of papers related to your research topic/question.
Literature review part 1: Reading List
You will investigate a topic within a sub-area of the digital media realm that intersects with something of value to your own research goals in the program.
In the first phase you will build up a reading list on your chosen topic. Please create a document that you can share (e.g. a github page, a word doc, a google doc, etc.) to collect your notes and references as you develop this reading list.
Here's an example reading list for a grant proposal I'm currently developing
As we saw in class, good research needs a good research question, but that doesn't always become apparent at first. Start wider until you are ready to go deep & narrow. Seek out key papers, conferences, and other key resources for the topic.
Some investigative tips:
Follow "cited by" and "related articles" trails to find new leads. Aim to find the most significant (qualitatively & quantitatively) papers. You should be aiming for a collection of 25-50 potentially interesting papers on the topic at this point.

Let's continue with the sketching.
First a quick note -- what we are doing looks a bit like p5.js. In fact, if we remember to refactor code that we will re-use into re-usable functions, then it might start to look even more like p5.js -- maybe we will have `line()` and `background()` etc. That's good: we are in the stage of reproducing research. And if we find there are moments where we want to do things a little differently, because of the needs of our project, that's good too -- we aren't limited to what's already given, because we know how to remake it, and maybe we'll have a discovery that can advance research!
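For instance (a hypothetical sketch, not code we wrote in class), repeated canvas calls could be factored into little helpers whose names echo p5.js:

```js
// Hypothetical helpers wrapping the canvas 2D API (names chosen to echo p5.js):
function background(ctx, color = "black") {
  ctx.fillStyle = color;
  ctx.fillRect(0, 0, ctx.canvas.width, ctx.canvas.height);
}

function line(ctx, x1, y1, x2, y2, color = "white", width = 1) {
  ctx.strokeStyle = color;
  ctx.lineWidth = width;
  ctx.beginPath();
  ctx.moveTo(x1, y1);
  ctx.lineTo(x2, y2);
  ctx.stroke();
}
```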
We saw how we can draw in response to mouse/touch movements, and add generative variation to them.
We talked about Maeda's comment that "the most interesting pixels are the mouse", and how this represents not just space but also time. How can we use the timing of a drawing gesture to modify the result? Can you think of ways to use speeds, rhythms, echoes, ...?
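For example, here is one minimal sketch (my own illustration, assuming a `<canvas>` element bound to `canvas` with a 2D context `ctx`) in which the speed of the gesture drives the line width -- slow, careful movements draw thick lines, fast flicks draw thin ones:

```js
// Let gesture speed modify the drawing: store the previous point and its timestamp.
let prev = null;
canvas.addEventListener("pointermove", (e) => {
  const now = performance.now();
  if (prev) {
    const dx = e.offsetX - prev.x, dy = e.offsetY - prev.y;
    const dt = Math.max(1, now - prev.t);        // milliseconds since the last point
    const speed = Math.hypot(dx, dy) / dt;       // pixels per millisecond
    ctx.lineWidth = 1 + 10 / (1 + speed * 5);    // slower gestures -> thicker lines
    ctx.beginPath();
    ctx.moveTo(prev.x, prev.y);
    ctx.lineTo(e.offsetX, e.offsetY);
    ctx.stroke();
  }
  prev = { x: e.offsetX, y: e.offsetY, t: now };
});
```

The same timestamps could just as well drive echoes (replaying stored points after a delay) or rhythms (only drawing on a periodic beat).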
A more complex example, inspired by Paul Haeberli's Dynadraw:
Example script from a previous class
With these steps, we should be in a position to attempt to reconstruct Golan Levin's Curly/Yellowtail, for example.
This is an example of reproducing research. First we should sketch out what is required based on the source material, and work from there to refine from a sketch through pseudo-code and implementation of components until we have the final result.
"Yellowtail repeats a user's strokes end-over-end, enabling simultaneous specification of a line's shape and quality of movement. Each line repeats according to its own period, producing an ever-changing and responsive display of lively, worm-like textures."
Detailed description from page 73 of the thesis:
"a user’s linear marks transform into an animated display of lively, worm-like lines. After the user deposited a mark, the system would then procedurally displace that mark end-over-end, making possible the simultaneous specification of both a line’s shape as well as its quality of movement. Straight marks would move along the direction of their own principal axes, while circular marks would chase their own tails. Marks with more irregular shapes would move in similarly irregular, but nonetheless rhythmic patterns."
The " screen space obeyed periodic (toroidal-topology) boundary conditions, such that marks which crossed the edge of the screen would reëmerge on the screen’s opposite side, rather than disappearing altogether."
Notice also the self-observation and critique, see p79. Although this project does not achieve the goal of the thesis, these observations inform the progress that follows. This is a positive research path.
OK so let's start by pseudo-coding Yellowtail!
Here's what we ended up with as pseudo-code in class, before we started coding:
there is a canvas
state:
mouse: x, y, buttonstate
time
currentpath = null
list of finished paths
start position
list of segments (dx, dy change vectors)
pointerdown:
create a new currentpath object, with start position at mouse x,y & t
pointerup:
if currentpath
add my currentpath to the list of finished paths
currentpath = null again
pointermove:
if currentpath exists
add mouse dx,dy & t to currentpath's list of segments
animate:
for each path of finished paths
remove 1st segment (shift)
(something about coordinates)
stick it onto the end (push)
wrap around canvas width/height
e.g. if x > width; x -= width, etc. for 4 boundaries
drawpath:
begin position at path's start position
for each segment of the path
line from last position to new position by adding segment change
(path, moveto, lineto, stroke)
draw:
clear screen
for each path of finished paths
drawpath(line)
if currentpath exists
drawpath(currentpath)
And here's the final code we ended up with:
Here's a more refined version from last year's class:
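If you want a starting point to compare against, here is a minimal sketch of the pseudo-code above (my own approximation, not the class Codepen), assuming a page with `<canvas id="canvas" width="600" height="400">`:

```js
// Yellowtail-like sketch: record strokes as change vectors, then cycle them end-over-end.
const canvas = document.getElementById("canvas");
const ctx = canvas.getContext("2d");

let currentpath = null;     // the stroke being drawn right now
const finishedpaths = [];   // strokes that animate end-over-end

canvas.addEventListener("pointerdown", (e) => {
  currentpath = { x: e.offsetX, y: e.offsetY, lastX: e.offsetX, lastY: e.offsetY, segments: [] };
});

canvas.addEventListener("pointermove", (e) => {
  if (!currentpath) return;
  // store change vectors (dx, dy) rather than absolute positions:
  currentpath.segments.push({ dx: e.offsetX - currentpath.lastX, dy: e.offsetY - currentpath.lastY });
  currentpath.lastX = e.offsetX;
  currentpath.lastY = e.offsetY;
});

canvas.addEventListener("pointerup", () => {
  if (currentpath && currentpath.segments.length > 1) finishedpaths.push(currentpath);
  currentpath = null;
});

function drawpath(path) {
  let { x, y } = path;
  ctx.beginPath();
  ctx.moveTo(x, y);
  for (const seg of path.segments) {
    x += seg.dx;
    y += seg.dy;
    ctx.lineTo(x, y);
  }
  ctx.stroke();
}

function animate() {
  for (const path of finishedpaths) {
    const seg = path.segments.shift();  // remove the first segment...
    path.segments.push(seg);            // ...and stick it onto the end
    path.x += seg.dx;                   // advance the start position by that segment
    path.y += seg.dy;
    // wrap around the canvas (toroidal boundary conditions):
    if (path.x > canvas.width) path.x -= canvas.width;
    if (path.x < 0) path.x += canvas.width;
    if (path.y > canvas.height) path.y -= canvas.height;
    if (path.y < 0) path.y += canvas.height;
  }
}

function draw() {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  for (const path of finishedpaths) drawpath(path);
  if (currentpath) drawpath(currentpath);
  animate();
  requestAnimationFrame(draw);
}
draw();
```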
Please continue working on extending and mutating this into a new direction! We will share each other's codepens in the next class.
Some rules of thumb while coding:
Use the simplest limits you can -- e.g. limiting yourself to drawing only black lines. More colour, shape and style variations can always be added later. Let's focus on behaviour first.
Break a problem down into sub-problems. Approach the problem from a simpler approximation first -- the simplest version. E.g. make it work in a static way before a dynamic way, or make it work for one, before making it work for many, etc.
Try to work out a problem in pseudocode first -- just write it in comments, draw it on paper, etc.; any form that is concrete will help you see the problem more clearly, and diving head-first into code is often not the right thing to do. Once the method becomes clearer, start converting pseudocode into "minimum viable" code.
Use event handlers (draw-frame, mouse, keyboard, timers, ...) to animate and interact with things.
Figure out working conditions logically from basic requirements. E.g. for anything to animate we're going to need to clear the screen on each frame, which means we're going to have to redraw everything every time, which means...
Use state (variables for numbers, strings, objects, arrays) to make things exist over time. Once captured, data can be replayed, modified, etc. Often you can represent state in a few different ways, and the choice will make some processes easier than others.
Test often. Each time you add one minor element, make sure it works for all likely input.
Handle special cases: starting values, boundary cases, error handling...
Don't worry about trying to make anything optimal -- make the most naive way that works, then refine from there.
Use abstractions (functions, objects) to encapsulate and structure ideas more easily & clearly. Any time you feel like you are writing the same code several times, replace it with a function or loop. Separate out reusable "support routines" from the code that represents the main ideas.
Comment the code and use good variable names -- you'll thank yourself in the future when you come back to it! (And anyone else looking at the code will thank you more -- remember research is about sharing!)
Take notes as you go. At any time you might have an idea of a different direction to explore -- you can only do one at once, so write them down! Even if they are just comments in the code.
Make many copies, saving a version (in Codepen you can do this via a Fork) for each minor milestone. If it goes wrong but in an interesting way (a happy accident), save a version of that too.
Sep 25, 2023 Class Recording
Introduction Presentations -- let's get to know each other, and our diverse backgrounds. And let's hear about your research topic areas and questions!
As we move through the next stages of your paper preparation, you will begin to refine and revise your paper's research question or problem statement.
A research problem statement is the foundational element of any academic paper that clearly articulates the specific issue, gap, or question your research addresses. It serves as the bridge between existing knowledge and what your study aims to contribute to the field.
A good problem statement reflects the kind of intellectual sophistication expected at the graduate level and beyond, where research moves beyond summarizing existing knowledge toward generating new understanding.
This can be surprisingly difficult to do well, but it is essential -- and the better the problem statement, the easier everything else becomes.
The key components of a problem statement (not necessarily in this order):
What it should not contain:
What is the purpose of a problem statement?
That last point can be especially important -- think about this from a reviewer's point of view. What would make them consider this paper worth publishing, or this proposal worth funding?
To a reviewer, a strong research problem statement shares several characteristics:
Research problem statements vary considerably across different fields within arts and sciences. In the humanities, problems often emerge from interpretive gaps, textual ambiguities, or underexplored cultural phenomena. For example, a literature scholar might identify how existing criticism has overlooked the influence of specific historical contexts on an author's work. In the natural sciences, problems typically arise from observational anomalies, theoretical inconsistencies, or the need for new methodological approaches. In the arts, problem statements may be focused on analysis of works and practices, or the development of new practices or forms of expression, or particularly in our field, how new technologies make us rethink norms and practices in the arts. See guidance from Emily Carr here and here
Common pitfalls to avoid
Look at examples

I suggest, as a productive activity, that you look at the papers in your reading list and identify their problem statements. Usually this is given in summarized form in the abstract, and in a more fleshed-out form in the Introduction. For each one, identify the context, gap, question, addressability and significance. Imagine you are a reviewer: consider which ones you think are well-formed and clear, and which ones have pitfalls as written. This exercise can help you to better write your own problem statement.
Here are some from my current reading list
Hemment, Drew, Cory Kommers, Ruth Ahnert, Maria Antoniak, Glauco Arbix, Vaishak Belle, Steve Benford et al. "Doing AI differently: rethinking the foundations of AI via the humanities." (2025). https://www.turing.ac.uk/news/publications/doing-ai-differently
"Artificial Intelligence is rapidly becoming global infrastructure – shaping decisions in healthcare, education, industry, and everyday life. Yet current AI systems face a fundamental limitation: they are shaped by narrow operational metrics that fail to reflect the diversity, ambiguity, and richness of human experience. This white paper presents a research vision that positions interpretive depth as essential to building AI systems capable of engaging meaningfully with cultural complexity – while recognising that no technical solution alone can resolve the challenges these systems face in diverse human contexts."
Rein, Patrick, Stefan Ramson, Jens Lincke, Robert Hirschfeld, and Tobias Pape. "Exploratory and live, programming and coding: a literature study comparing perspectives on liveness." arXiv preprint arXiv:1807.08578 (2018). https://arxiv.org/pdf/1807.08578
Various programming tools, languages, and environments give programmers the impression of changing a program while it is running. This experience of liveness has been discussed for over two decades and a broad spectrum of research on this topic exists. This work has been carried out in the communities around three major ideas which incorporate liveness as an important aspect: live programming, exploratory programming, and live coding. While there have been publications on the focus of each particular community, the overall spectrum of liveness across these three communities has not been investigated yet. Thus, we want to delineate the variety of research on liveness. At the same time, we want to investigate overlaps and differences in the values and contributions between the three communities. Therefore, we conducted a literature study with a sample of 212 publications on the terms retrieved from three major indexing services. In delineating the spectrum of work on liveness, we hope to make the individual communities more aware of the work of the others. Further, by giving an overview of the values and methods of the individual communities, we hope to provide researchers new to the field of liveness with an initial overview.
Xu, Feiyu, Hans Uszkoreit, Yangzhou Du, Wei Fan, Dongyan Zhao, and Jun Zhu. "Explainable AI: A brief survey on history, research areas, approaches and challenges." In CCF international conference on natural language processing and Chinese computing, pp. 563-574. Cham: Springer International Publishing, 2019. https://www.researchgate.net/profile/Feiyu-Xu/publication/336131051_Explainable_AI_A_Brief_Survey_on_History_Research_Areas_Approaches_and_Challenges/links/5e2b496f92851c3aadd7bf08/Explainable-AI-A-Brief-Survey-on-History-Research-Areas-Approaches-and-Challenges.pdf
Deep learning has made significant contribution to the recent progress in artificial intelligence. In comparison to traditional machine learning methods such as decision trees and support vector machines, deep learning methods have achieved substantial improvement in various prediction tasks. However, deep neural networks (DNNs) are comparably weak in explaining their inference processes and final results, and they are typically treated as a black-box by both developers and users. Some people even consider DNNs (deep neural networks) in the current stage rather as alchemy, than as real science. In many real-world applications such as business decision, medical diagnosis and investment recommendation, explainability and transparency of our AI systems become particularly essential for their users, for the people who are affected by AI decisions, and furthermore, for the researchers and developers who create the AI solutions. This paper first introduces Explainable AI, starting from expert systems and traditional machine learning approaches to the latest progress in the context of modern deep learning, and then describes the major research areas and the state-of-the-art approaches in recent years.
Michael Palumbo, Alexander Zonta, Graham Wakefield. "Modular reality: Analogues of patching in immersive space". Journal of New Music Research, DOI: 10.1080/09298215.2019.1706583. Taylor and Francis, 10 Jan 2020. https://www.researchgate.net/publication/338516272_Modular_reality_Analogues_of_patching_in_immersive_space
Despite decades of virtual reality (VR) research, current creative workflows remain far from VR founder Jaron Lanier’s musically inspired dream of collaboratively ‘improvising reality’ from within. Drawing inspiration from modular synthesis as a distinctive musically immersed culture and practice, this article presents a new environment for visual programming within VR that supports live, fine-grained, multi-artist collaboration, through a new framework for operational transformations on graph structures. Although presently focused on audio synthesis, it is articulated as a first step along a path to synthesising worlds.
Ji, Haru Hyunkyung, and Graham Wakefield. "Entanglement: an immersive art of an engagement with non-conscious intelligence." In Proceedings of the International Symposium of Electronic Arts (ISEA), Seoul, Korea. 2025.
This paper describes an artwork combining procedural modeling, generative AI, and dynamic simulation to create a seamless immersive installation inspired by the motif of the forest and its underground fungal network. The artwork is grounded in the imperative to draw attention to non-conscious cognition, in biological and machine senses, as a reminder of the essential more-than-human-world around us. It addresses these themes by integrating biologically-inspired dynamic simulations with non-narrative spatial storytelling. The paper’s contributions also include challenging the limitations of image-based generative AI in achieving consistency in long-form continuous video at high resolutions while balancing aesthetic control to create a valuable tool within an artist’s original workflow.
Creative Human-AI Agency through Embodied Exploration and Ecological Thinking in XR (Grant draft work in progress)
At rapid rates, the application of AI is transforming nearly every aspect of real-life society with far-reaching cultural implications. Nevertheless, this progression remains severely unbalanced, driven by a technocentric orientation that privileges clear goal-oriented efficiency over qualitative depth and nuanced reflection. Such a trajectory tends to prioritize making AI smarter while pressuring humans to adapt to AI systems, rather than strengthening human capacities and agency. Ultimately, this orientation not only risks subordinating human values to technological imperatives but also diminishes opportunities for human problem-solving experiences, thereby threatening the long-term development and sustainability of diverse cognitive and creative abilities. Within this landscape, the “CHAI4E” project aims to develop alternative human-centered AI designs and practices that support the expansion of human wisdom and agency. This is a fundamental shift that positions arts and humanities at the heart of innovation, emphasizing the qualitative richness of embodied experience through XR, as a foundation for new forms of human-AI co-creation. The proposed grant supports research that advances human-AI co-creation within XR environments through an integrated program comprising prototypes, case studies, and scholarship that reimagines Human–AI interactions within XR. To reorient AI development toward augmenting human capabilities and creating environments in which technology adapts to human needs, rather than the reverse, this project asks three interrelated questions: First, how can XR-based Human–AI systems foster environments that strengthen rather than constrain human agency and learning? Second, what design principles and workflows best ensure interpretive depth and experiential richness in Human–AI co-creation? And third, how creative agency and reflection-in-action can be meaningfully assessed in Human–AI–XR contexts? Through practice-based experiments and case studies, this project aims to generate new theoretical and practical contributions to human–computer interaction, philosophy of creativity, and the arts, that can produce meaningful shifts in creative agency. The impact will extend beyond academia by empowering artists, students, and publics to engage AI not as passive consumers but as active co-creators, fostering broader cultural literacy around human agency in technologically mediated futures.
My PhD: Wakefield, Graham. Real-time meta-programming for interactive computational arts. University of California at Santa Barbara, 2012
In the interactive computer arts, any advance that significantly amplifies or extends the limits and capacities of software can enable genuinely novel aesthetic experiences. Within compute-intensive media arts, flexibility is often sacrificed for needs of efficiency, through the total separation of machine code optimization and run-time execution. Compromises based on modular run-time combinations of prior-optimized 'black box' components confine results to a pre-defined palette with less computational efficiency overall: limiting the open-endedness of development environments and the generative scope of artworks. This dissertation demonstrates how the trade-off between flexibility and efficiency can be relaxed using reflective meta-programming and dynamic compilation: extending a program with new efficient routines while it runs. It promises benefits of more open-ended real-time systems, more complex algorithms, richer media, and ultimately unprecedented aesthetic experiences. The dissertation charts the significant differences that this approach implies for interactive computational arts, builds a conceptual framework of techniques and requirements to respond to its challenges, and documents supporting implementations in two specific scenarios. The first concentrates on open-ended creativity support within always-on authoring environments for studio work and live coding performance, while the second concerns the open-endedness of generative art through interactive, immersive artificial-life worlds.
My Master's thesis: Wakefield, G. "Vessel: A platform for computer music composition, interleaving sample-accurate synthesis and control." Master’s thesis, University of California Santa Barbara, 2007.
The rich new terrains offered by computer music invite the exploration of new techniques to compose within them. The computational nature of the medium has suggested algorithmic approaches to composition in the form of generative musical structure at the note level and above, and audio signal processing at the level of individual samples. In the region between these levels, the domain of microsound, we may wish to investigate the musical potential of sonic particles that interrelate both signal processing and generative structure. In this thesis I present a software platform (‘Vessel’) for the exploration of such potential. In particular, a solution to the efficient scheduling of interleaved sound synthesis and algorithmic control with sample accuracy is expounded. The formal foundations, design and implementation are described, the project is contrasted with existing work, and avenues for musical application and future exploration are proposed.
Homework
Please submit your Reading list, project title and problem statement via eClass here: https://eclass.yorku.ca/mod/assign/view.php?id=3829664
Next, you should be starting to develop this into your Annotated Bibliography
Oct 2, 2023 Class Recording
Research paper work:
Introduction Presentations part II
Sharing your code explorations from our sketches of reproducing and extending Curly/Yellowtail in Week 2
Here's where I got to after cleaning up the code and adding a little more visual refinement:
To complete the Yellowtail reproduction I'd like to add sound -- so that's our next focus topic. But before we jump in -- how would you sonify it?
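As one (entirely optional) browser-side sketch of an answer, before we turn to Max/gen~: each finished path could drive its own oscillator, with the size of the segment being consumed each frame mapped to pitch. This assumes the path objects from our earlier Yellowtail sketch (a list of {dx, dy} segments) and uses the Web Audio API; the mapping itself is arbitrary.

```js
// Sonification sketch: one oscillator per animated path (Web Audio).
// Note: browsers require a user gesture (e.g. a click) before audio will start.
const actx = new AudioContext();

function addVoice(path) {
  const osc = actx.createOscillator();
  const amp = actx.createGain();
  amp.gain.value = 0.1;                       // keep it quiet
  osc.connect(amp).connect(actx.destination);
  osc.start();
  path.voice = osc;                           // keep a handle for later modulation
}

// Call this from the animation loop, once per frame for each finished path:
function sonify(path) {
  if (!path.voice) addVoice(path);
  const seg = path.segments[0];               // the segment about to be consumed
  const speed = Math.hypot(seg.dx, seg.dy);   // pixels per frame
  const hz = 110 + speed * 40;                // arbitrary mapping of speed to pitch
  path.voice.frequency.setTargetAtTime(hz, actx.currentTime, 0.05);
}
```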
What is digital audio, computer music, and sound synthesis
https://docs.google.com/presentation/d/1jmVITeEwAtnMNNFXJgHRgKLdmk44TF8D4J01wHLofdA/
A very quick introduction to Max and gen~
Class notes for DATT3074: Creative Generative Audio Signal Processing
Some key concepts and circuits:
- `noise`, `cycle`, `phasor` (look at `scope`)
- `mix`, `param`
- triangle shape: `phasor`, `mix` and `history`; cascade them; highpass; `*` and `wrap 0 1`
- `delta`, `abs`, `> 0.5`
- `latch`; source noise (gates?) or a related frequency (melodies?)
- `mtof`, `exp2`
- `* N`, `floor`, `/ N`; neat trick of quantizing twice (second to 12)
- `delay` and mixing feedback, modulating time
- saturation (`tanh`), filtering, etc. in the feedback loop

Embedding our yellowtail in Max:

- (`dist`), use the `readfile` message to `jweb` in Max

Embedding in Ableton Live?
Exporting via RNBO
The patcher from today's class
Oct 9 Class Recording
1-on-1 feedback on paper progress
Oct 23, 2023 Class Recording
For next week, complete your paper draft. It must be submitted before next week's class begins so that we can run the peer review session.
Catching up unfinished work from a previous class:
Some key concepts and circuits:
- `mix` and `history`; cascade them; highpass; `*` and `wrap 0 1`
- `delta`, `abs`, `> 0.5`
- `latch`; source noise (gates?) or a related frequency (melodies?)
- `mtof`, `exp2`
- `* N`, `floor`, `/ N`; neat trick of quantizing twice (second to 12)
- `delay` and mixing feedback, modulating time
- saturation (`tanh`), filtering, etc. in the feedback loop

Embedding our yellowtail in Max:

- (`dist`), use the `readfile` message to `jweb` in Max

Embedding in Ableton Live?
Exporting via RNBO
Alternatively, just send network messages, such as OSC, or MIDI messages, etc.
As mentioned earlier in this class, one option for the final submission, as an alternative to a paper, is an "explorable explanation". What does that mean?
The term is borrowed from a 2011 article by Bret Victor. Here are a couple of quotes that are highly relevant to our goals in this course:
"What does it mean to be an active reader? An active reader asks questions, considers alternatives, questions assumptions, and even questions the trustworthiness of the author. An active reader tries to generalize specific examples, and devise specific examples for generalities. An active reader doesn't passively sponge up information, but uses the author's argument as a springboard for critical thought and deep understanding."
This is great advice for a researcher, and great things to do while annotating a bibliography!
"A typical reading tool, such as a book or website, displays the author's argument, and nothing else. The reader's line of thought remains internal and invisible, vague and speculative. We form questions, but can't answer them. We consider alternatives, but can't explore them. We question assumptions, but can't verify them. And so, in the end, we blindly trust, or blindly don't, and we miss the deep understanding that comes from dialogue and exploration."
Against this he suggests creating "Explorable Explanations":
"The goal is to change people's relationship with text. People currently think of text as information to be consumed. I want text to be used as an environment to think in."
He shows a few examples of how we can embed reactive elements and interactive simulations within a document. This isn't just a novelty:
"It's tempting to be impressed by the novelty of an interactive widget such as this, but the interactivity itself is not really the point. The primary point of this example -- the reason I call it an "explorable explanation" -- is the subtlety with which the explorable is integrated with the explanation."
By interacting with these elements we can verify statements, develop intuition, make discoveries, and explore new questions about the topic.
When we allow the user to interact with the data, it is not only how the data is displayed but also how it behaves that creates meaning. Interactive animation as exploration: user studies have shown that animation is more effective when knowledge is actively constructed, i.e. in combination with interaction.
This has inspired a whole range of projects, including a collection of Explorable Explanations at https://explorabl.es.
Many examples of Explorable Explanations
Another collection, focused on Complex Systems at www.complexity-explorables.org
It has also inspired a recent phenomenon of sharing academic research in interactive form:
Can your research papers be presented in this kind of format?
D3.js is a JavaScript library for manipulating documents based on data. It has been one of the most widely used platforms for online data visualization for over a decade.
To pull in D3, either grab the latest version (https://cdnjs.cloudflare.com/ajax/libs/d3/7.9.0/d3.min.js at the time of writing) and add it to your <head> tag -- or in Codepen, just search for "d3" in the "Add External Scripts" section of the "JS" settings.
<head>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/d3/7.9.0/d3.min.js"></script>
</head>
D3 allows you to bind arbitrary data to a Document Object Model (DOM), and then apply data-driven transformations to the document. For example, you can use D3 to generate an HTML bar chart from an array of numbers:
https://codepen.io/grrrwaaa/pen/VYeXJPJ
In that example, D3 is working by manipulating the HTML's Document Object Model (DOM), the tree of elements within the page. We're using CSS to style those elements, and javascript to control the number and parameters of those elements according to existing data.
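For reference, a stand-alone sketch in the same spirit (using inline styles rather than the pen's CSS) might look like this:

```js
// Bar chart from plain divs: one div per datum, with its width driven by the data.
d3.select("body")
  .selectAll("div")
  .data([4, 8, 15, 16, 23, 42])
  .enter()
  .append("div")
  .style("background", "steelblue")
  .style("color", "white")
  .style("margin", "2px")
  .style("padding", "2px")
  .style("width", (d) => d * 10 + "px")
  .text((d) => d);
```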
D3 does this by means of a query/attribute system on the DOM. E.g., to change all paragraph text to be white:
d3.selectAll("p").style("color", "white");
Yet styles, attributes, and other properties can be specified as functions of data in D3, not just simple constants. For example, to alternate shades of gray for even and odd nodes:
d3.selectAll("p").style("color", function(d, i) {
return i % 2 ? "#fff" : "#eee";
});
Despite their apparent simplicity, these functions can be surprisingly powerful.
Computed properties often refer to bound data. Data is specified as an array of values, and each value is passed as the first argument (d) to selection functions.
d3.selectAll("p")
.data([4, 8, 15, 16, 23, 42])
.style("font-size", function(d) { return d + "px"; });
The data() method maps each element of the array to each DOM node in the selection.
Instead of generating data in JavaScript, it can be loaded from local files with d3.csv(), d3.tsv(), d3.json(), d3.text(), etc.
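For example (with a hypothetical data.csv file next to the page), these loaders return Promises in current versions of D3:

```js
// Load and parse a CSV file; d3.autoType converts numeric/date-like strings.
d3.csv("data.csv", d3.autoType).then((data) => {
  console.log(data.length, "rows; first row:", data[0]);
  // ...then bind `data` to a selection as in the examples above.
});
```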
(The strange dot syntax is called "chaining": when a function returns an object, we can call methods on that object in turn.)
Don't worry about matching the array length: using D3’s enter(), you can create new nodes for incoming data:
d3.select("body").selectAll("p")
.data([4, 8, 15, 16, 23, 42])
.enter()
.append("p")
.html(function(d) { return "I’m number " + d + "!"; });
D3 lets you transform documents based on data; this includes both creating (enter) and destroying (exit) elements. D3 allows you to change an existing document in response to user interaction, animation over time, or even asynchronous notification from a third-party. D3 is easy to debug using the browser’s built-in element inspector: the nodes that you manipulate with D3 are exactly those that the browser understands natively.
In addition to using standard HTML and CSS, D3 allows you to use another web standard, SVG (scalable vector graphics) for drawing graphical representations of data in the DOM. You can create SVG elements using D3 and style them with external stylesheets. SVG can be embedded in HTML pages just like any other tag, e.g. a blue circle:
<svg width="50" height="50">
<circle cx="25" cy="25" r="22"
fill="blue"/>
</svg>
Whereas Canvas drawing is all pixel-based, SVG drawing is vector-based. Moreover, every element in an SVG is part of the DOM, so it can be manipulated dynamically by D3.js. Here's an example SVG:
<svg id="mysvg" width=500 height=500>
<rect x="0" y="0" width="500" height="10" />
<circle cx="250" cy="20" r="5" fill="yellow" stroke="orange" stroke-width="2"/>
<g transform="translate(250,30)">
<ellipse cx="0" cy="0" rx="10" ry="5" class="pumpkin"/>
</g>
<line x1="0" y1="40" x2="500" y2="50" stroke="black"/>
<text x="250" y="60">Easy-peasy</text>
</svg>
SVG is always wrapped in an <svg> element, which should have a width & height (graphics will be clipped to this box). Within that, SVG code itself is a form of XML. Simple SVG shapes include rect, circle, ellipse, line, text, and path. The coordinate system is pixel based, with 0,0 at the top left. Common SVG properties are: fill (CSS color), stroke (CSS color), stroke-width, opacity (0.0 is transparent, 1.0 is opaque). These can all be set with CSS styles. All text will inherit the CSS-specified font styles of its parent element unless specified otherwise via CSS.
let svg = d3.select("#mysvg");
svg
  .selectAll("circle")
  .data([4, 8, 15, 16, 23, 42])
  .enter()
  .append("circle")
  .attr("fill", "blue")
  .attr("cy", 50)
  .attr("r", (d, i) => d);
D3’s focus on transformation extends naturally to animated transitions. Transitions gradually interpolate styles and attributes over time. For example, to reposition circles with a staggered delay:
svg
.selectAll("circle")
.transition()
.duration(750)
.delay(function (d, i) {
return i * 40;
})
.attr("cx", function (d, i) {
return (i + 1) * 50;
});
D3 can also very easily, and powerfully, animate transitions. A great in-browser demo here.
With dynamically changing data of varying length, we often need to specify how items appear, update, and disappear. This is the common enter/update/exit pattern:
https://codepen.io/grrrwaaa/pen/VYeXJqL
When using transitions and dynamically updated data, it is very important to pass a second key argument to the data() call; this key is a function that returns the unique identifier of a given data record; that way D3 knows which records to animate when the data changes. The example above used the letter value itself as the unique identifier key.
The data array does not need to be simply an array of numbers; it can be an array of objects. Each one of those objects will be passed to the attr() handlers for each item. It therefore makes a whole lot of sense to prepare and annotate this array of objects before passing to D3 rendering. If each item is an object, we can store the unique identifier in this object.
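Putting those two points together, a sketch of the enter/update/exit pattern might look like the following (hypothetical data; assuming the `svg` selection from the earlier example):

```js
// Each record carries its own unique id, passed as the key function to data().
function update(data) {
  svg.selectAll("circle")
    .data(data, (d) => d.id)            // key: match records by id, not by array index
    .join("circle")                      // creates entering circles, removes exiting ones
    .transition().duration(750)
    .attr("cx", (d, i) => (i + 1) * 50)
    .attr("cy", 50)
    .attr("r", (d) => d.value);
}

update([{ id: "a", value: 10 }, { id: "b", value: 20 }]);
// one second later: "a" exits, "b" updates, "c" enters
setTimeout(() => update([{ id: "b", value: 25 }, { id: "c", value: 15 }]), 1000);
```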
Scales are functions that map from an input domain to an output range. Since data is unlikely to be in the same range as the svg pixels, a scale function can be used to provide the transformation from an input domain of fahrenheit to an output range of celsius:
var scale = d3.scaleLinear()
  .domain([0, 100]) // fahrenheit
  .range([-17.7778, 37.7778]); // celsius
  // .clamp(true)
  // .nice()
scale(32); // returns 0
scale(100); // returns 37.7778
scale(212); // returns 100
Other scale types include pow, log, quantize, quantile, and ordinal scales; there is also d3.scaleTime for dates and times.
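For instance, a time scale mapping one year of (hypothetical) dates onto 500 pixels:

```js
const x = d3.scaleTime()
  .domain([new Date(2025, 0, 1), new Date(2025, 11, 31)])  // Jan 1 to Dec 31
  .range([0, 500]);

x(new Date(2025, 6, 1)); // ≈ 249 -- July 1 lands about halfway along
```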
If the data is large, or came from a different data provider, it is probably in a separate JSON, CSV, or other external file. The D3 library has many methods for loading and parsing external data
In the first chapter of the Visualizing Data book, Ben Fry sets up the data visualization process as a series of steps:

(prepare data)
- Obtain the data, whether from a file on a disk or a source over a network.
- Provide some structure for the data's meaning, and order it into categories. Remove all but the data of interest.
- Apply methods from statistics or data mining as a way to discern patterns or place the data in mathematical context.

(visualise)
- Choose a basic visual model, such as a bar graph, list, or tree.
- Improve the basic representation to make it clearer and more visually engaging.
- Add methods for manipulating the data or controlling what features are visible.

(publish)
Tim Berners-Lee (www founder) TED talk
Do you know exactly how much of your tax money is spent on street lights or on cancer research? What is the shortest, safest and most scenic bicycle route from your home to your work? And what is in the air that you breathe along the way? Where in your region will you find the best job opportunities and the highest number of fruit trees per capita? When can you influence decisions about topics you deeply care about, and whom should you talk to?
New technologies now make it possible to build the services to answer these questions automatically. Much of the data you would need to answer these questions is generated by public bodies. However, often the data required is not yet available in a form which is easy to use. This book is about how to unlock the potential of official and other information to enable new services, to improve the lives of citizens and to make government and society work better.
The notion of open data and specifically open government data - information, public or otherwise, which anyone is free to access and re-use for any purpose - has been around for some years. In 2009 open data started to become visible in the mainstream, with various governments (such as the USA, UK, Canada and New Zealand) announcing new initiatives towards opening up their public information.
Open Data may come in the form of a whole static database (CSV, EXCEL, TXT etc.), or it may be served as an API. An API will require some kind of request structure, such as location for a weather report, and should describe the structure of the response to expect. The open data documentation should also explain whether it includes geospatial information, and how frequently it is refreshed (if appropriate).
Most major online services have an API, and many allow you to acquire data surrounding your activity. If you have a Google site, an Android/iOS app, or even a Unity game, you might already be collecting data via Google Analytics. If you host a site or code repository at GitHub, they have some great APIs you can use. Similarly for accessing your Facebook data. Look further afield -- even your bank might have an API you can use, or offer you the option to download transaction histories as a static database. You can probably browse your phone's location history here
Some examples:
Weather -- free sign up for API key. See example here
For example, the Bike Share database is updated in near real-time.
Unfortunately, like many open data resources, you can't just load this in D3 using d3.json(url) from a Codepen webpage, because of CORS (Cross-Origin Resource Sharing) restrictions -- the browser treats it as a security risk. But you can do this from a server, even a server running locally on your own machine, using Node.js for example.
For security, browsers typically do not allow a website on one domain to dynamically pull in data from another domain; i.e. they typically apply a same-domain policy. Fortunately, in the case of XMLHttpRequests, the provider may explicitly allow CORS, as is the case for http://api.openweathermap.org. Moreover, most dynamic requests will fail when running the HTML file from a local filesystem. They need to be running from a server.
Node.js lets us write complex server applications, but it also provides a simple way to run a server from any location on your filesystem. First, install this capability on your computer by typing this in your terminal (you'll have to make sure node.js is installed first of course, see above):
npm install -g http-server
Once installed, you can run this from any location in your terminal like this:
http-server
And you can then open this in your browser at address http://0.0.0.0:8080/
If you are working in Max, you can also access these APIs via the maxurl object, or you can run a full-fledged Node.js application via the node.script object.
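One simple workaround is to fetch the data server-side and save it next to your page, so the page can load it same-origin. A minimal sketch (hypothetical URL; assuming Node 18+ for the built-in fetch):

```js
// fetch-data.js: fetch an open-data API server-side and save it locally,
// so that a page served by http-server can load it without CORS issues.
const fs = require("fs");

const url = "https://api.example.org/stations.json"; // hypothetical endpoint

async function main() {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  const data = await response.json();
  fs.writeFileSync("data.json", JSON.stringify(data, null, 2));
  console.log("saved data.json");
}

main();
```

Run it with `node fetch-data.js`, and then `d3.json("data.json")` will work from the page served by http-server.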
Cleaning data
Clean data is essential for good visualisations. Most data available online is not clean, normalized, or well-structured, because life is not clean, normalized, or well-structured. Good data is easily machine-readable, with semantic notes that are easily human-readable. Data is normalized, gaps are meaningfully handled, and noise is reduced. This may mean:
It may also imply some "mining" or analysis passes to generate more useful field values
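As a minimal sketch of what such cleaning steps can look like in JavaScript (hypothetical records and field names):

```js
const raw = [
  { id: "a1", date: "2025-03-01", value: "42", category: " Sensor " },
  { id: null, date: "2025-03-02", value: "n/a", category: "sensor" },
];

const cleaned = raw
  .filter((r) => r.date && r.value != null)                     // drop incomplete rows
  .map((r, i) => ({
    id: r.id ?? `row-${i}`,                                     // ensure a unique key (handy for D3)
    date: new Date(r.date),                                     // parse strings into real types
    value: +r.value,                                            // coerce numeric strings to numbers
    category: (r.category || "unknown").trim().toLowerCase(),   // normalize labels
  }))
  .filter((r) => !Number.isNaN(r.value));                       // reduce noise: drop failed parses

console.log(cleaned); // one clean record: { id: "a1", date: ..., value: 42, category: "sensor" }
```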
Further reading
Web mining is defined as the use of data mining, text mining, and information retrieval techniques to extract useful patterns and knowledge from the Web. A mashup, in web development, is a web page, or web application, that uses content from more than one source to create a single new service displayed in a single graphical interface. The term implies easy, fast integration, frequently using open application programming interfaces (open API) and data sources to produce enriched results that were not necessarily the original reason for producing the raw source data. The main characteristics of a mashup are combination, visualization, and aggregation.
Correlation is not causation. A wonderful example of spurious correlations.
For next week, complete your paper draft. It must be submitted before next week's class begins so that we can run the peer review session.
The paper should be prepared as typically required for a conference submission (see the Call for Papers).
Normally this means uploading a PDF to an online form. (Explorable explanations etc.: if your submission is an online-first paper, please just submit a link to this online paper. In other respects, please follow these guidelines as closely as possible.)
The article should contain:
Formatting: the article should be single-spaced with a text font size of 9, in a serif font (e.g. "Times"). If you want to use a standard conference template (highly recommended), here are
Images: You can use images, but make sure that all images have captions, and any text in the images is no smaller than the text in the paper. Frequently this is limited to about 1 image per page. In a conference submission you must identify how you have publishing rights to any images used, e.g. "images by author" or "image by {creator}, used by permission" or "by Creative Commons CC BY-NC-SA 2.0" etc.
Supplementary files: Frequently a conference will allow a submission to include a limited number of supplementary files, such as additional images, video files, sound files, which reviewers can use to evaluate the work.
Length: Conference papers can be 3-8 pages in length, but 4 pages is the most typical. At approximately 700 words/page, this means a typical word count is around 2000-3000 words.
Submit your paper via e-Class here
Oct 30, 2023 Class Recording
What qualifies as good research? One way of knowing is to look at how research is reviewed. Most grant programs, as well as journals and conference review bodies, publish guidelines for reviewers. These are the criteria by which your work will be evaluated.
Guidelines from
Here are key points:
"The purpose of peer review is to improve the quality of the manuscript under review, and of the material that is eventually published. Conscientious peer review is a time-consuming task but is essential to assure the quality of scientific journals."
Reviews should be conducted fairly and objectively. Personal criticism of the author is inappropriate. If the research reported in the manuscript is flawed, criticize the science, not the scientist. Criticisms should be objective, not merely differences of opinion, and intended to help the author improve his or her paper.
Comments should be constructive and designed to enhance the manuscript. You should consider yourself the authors’ mentor. Make your comments as complete and detailed as possible. Express your views clearly with supporting arguments and references as necessary. Include clear opinions about the strengths, weaknesses and relevance of the manuscript, its originality and its importance to the field. Specific comments that cite line numbers are most helpful.
Begin by identifying the major contributions of the paper. What are its major strengths and weaknesses, and its suitability for publication? Please include both general and specific comments bearing on these questions, and emphasize your most significant points. Support your general comments, positive or negative, with specific evidence.
Is the aim clearly stated? Do the title, abstract, key words, introduction, and conclusions accurately and consistently reflect the major point(s) of the paper? Is the writing concise, easy to follow, and interesting, without repetition?
Are the methods appropriate, sound, current, and described clearly enough that the work could be repeated by someone else? Is the research ethical and have the appropriate approvals/consent been obtained? Are appropriate analyses used? Are they sufficiently justified and explained? Are statements of significance justified? Are results supported by data? Are any of the results counterintuitive? Are the conclusions supported by the data presented?
Are the references cited the most appropriate to support the manuscript? Are citations provided for all assertions of fact not supported by the data in this paper? Are any key citations missing?
Should any portions of the paper be expanded, condensed, combined, or deleted?
Do not upload any part of a submitted paper to a cloud service, such as a grammar checker or AI tool; nor should you share it with anyone else.
These are general guidelines, but practices and cultures of value can differ greatly between research communities -- and we are often transdisciplinary...
We are running an 'internal review' process, emulating what is frequently done in conference submission review processes. Typically this means:
For our purposes, you will act as both author and reviewer: each of you will act as reviewers for the other students' submissions.
As a structure, our review is based on materials as used by the SIGGRAPH Art Papers review body. Each of you will be randomly assigned up to 3 papers to review.
Nov 6, 2023 Class Recording
Why GPU programming?
An example of the power of this technique: https://www.shadertoy.com/view/XsBXWt Notice that, apart from the cat gif, everything else in this example is generated by around 200 lines of code. It runs at a high frame rate, even in full screen. This is the kind of thing that is lauded in the "demoscene" world.
The language this is written in is GLSL. It is a way to write programs that run directly on your GPU. GLSL can be used on the web, as on ShaderToy, in Three.js, or in basically any web page in a modern browser -- even when opened on your phone or a VR headset like the Quest 3. GLSL is also used in desktop OpenGL environments, including TouchDesigner, Max/MSP/Jitter, Ossia, Hydra, and so on. It can also be used in Unity or Unreal, though they prefer a different shading language (HLSL), which can be translated to GLSL.
There are several kinds of shaders:
Today we'll be looking at fragment shaders.
We'll use ShaderToy for convenience; but you should know that the required code to set up a shader in a webpage is not that complex -- it can be done in around 100 lines of code.
The fragment shader is a program that runs separately for each fragment (think of it as a pixel). The main output of the fragment shader is a pixel colour, as a vec4 representing red, green, blue and alpha (opacity) components, between 0 and 1.
Sample code:
void mainImage(out vec4 fragColor, in vec2 fragCoord) {
vec4 yellow = vec4(1, 1, 0, 1); // red, green, blue, alpha
fragColor = yellow;
}
For the most part, GLSL here looks a lot like C, Java, and similar procedural, typed languages. You can think of the main function here as defining a program that runs per pixel (per fragment actually) of the output image. In this case, we set all pixels to a single color.
One slightly unusual feature is the out keyword: a function can have arguments that you can modify. In this case, the output pixel color.
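For instance, here is a small sketch (a variation of the sample above; the helper name makeYellow is just for illustration) showing an out parameter in action:

```glsl
// a helper with an 'out' parameter: whatever we assign to 'color'
// inside the function is written back into the caller's variable
void makeYellow(out vec4 color) {
    color = vec4(1.0, 1.0, 0.0, 1.0);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord) {
    makeYellow(fragColor); // fragColor is now yellow
}
```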
The output pixel is a vec4, which means it has four values, for Red, Green, Blue, and Alpha (opacity). vec4 is a built in type in GLSL, along with vec2 and vec3.
GLSL vectors have a few slightly idiosyncratic language features. You can index their components in a few different ways, including swizzling (re-ordering) them:
vec4 v = vec4(1, 0.5, 0.2, 0);
// these two are the same:
fragColor = vec4(v.x, v.y, v.z, v.w);
fragColor = v.xyzw;
// these two are the same:
fragColor = vec4(v.w, v.z, v.y, v.x);
fragColor = v.wxyz;
// these two are the same:
fragColor = vec4(v.x, v.x, v.x, v.x);
fragColor = v.xxxx;
// .r .g .b .a == .x .y .z .w
// these two are the same:
fragColor = v.xxxx;
fragColor = v.rrrr;
// compound a vec4 from vec3's, vec2's, and floats:
// these two are the same:
fragColor = vec4(v.xy, v.z, 1);
fragColor = vec4(v.rgb, 1);
// we can also create a vec4 from a single float like this:
// these two are the same:
fragColor = vec4(1, 1, 1, 1);
fragColor = vec4(1);
The vec2 fragCoord argument is the pixel location in integer pixel numbers, starting at the bottom-left. To turn that into a normalized coordinate, that goes from 0,0 at the bottom left, to 1,1 at the top right, we can divide by the image resolution. Shadertoy gives us the image resolution in the variable iResolution.xy.
// Normalized pixel coordinates (from 0 to 1)
vec2 uv = fragCoord/iResolution.xy;
// visualize X coordinate in red, Y coordinate in green:
fragColor = vec4(uv, 0, 1);
If we wanted a signed normalized coordinate, from -1 to +1, with 0,0 in the image center, we can do this:
// signed normalized pixel coordinates (from -1 to 1)
vec2 suv = uv*2.0 - 1.0;
// to take into account aspect ratio:
suv.x *= iResolution.x / iResolution.y;
So now we can use the normalized coordinate to make a pattern over space. Essentially, we are defining a field function that maps a vec2 position to a vec4 color.
For example, here's a repeating sinusoidal surface:
const float PI = 3.141592653589793;
vec2 grid = cos(10.0 * PI * suv);
fragColor = vec4(grid, 0, 1);
Notice how the cos function is quite happy to accept a vec2 and produce a vec2 result. This is true for most math functions in GLSL.
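A few more component-wise examples (a quick sketch, reusing the uv and suv coordinates defined above):

```glsl
// these all operate per component, just like cos above:
vec2 a = abs(suv);             // vec2 in, vec2 out
vec2 b = pow(uv, vec2(2.0));   // element-wise power
vec2 c = mix(a, b, 0.5);       // blend the two vec2s
fragColor = vec4(c, 0.0, 1.0); // visualize in red and green
```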
Or we can consider the pixel's distance from the center:
vec2 centre = vec2(0, 0);
float dist = distance(suv, centre);
// equivalent: length(suv - centre);
fragColor = vec4(dist);
To draw a point, a common approach here is to use an exponential decay of distance via exp(-sharpness * dist):
float sharpness = 50.0;
float spot = exp(-sharpness * dist);
fragColor = vec4(spot);
What we are doing is drawing a rapid falloff on the distance from a point. We can also turn this into a distance-from-circle, simply by subtracting the circle's radius from the distance.
vec2 centre = vec2(0, 0);
float radius = 0.2;
float dist = distance(suv, centre) - radius;
float sharpness = 50.0;
float spot = exp(-sharpness * dist);
fragColor = vec4(spot);
Or to draw several, we can use a modulo operation to divide up the space:
vec2 pos = mod(uv * 5.0, 1.0);
float dist = length(pos - 0.5);
float smoothResult = smoothstep(0.5, 0.46, dist);
fragColor = vec4(pos.xy, 1, 1) * smoothResult;
It's a squashed looking circle because we are working in normalized coordinates, and the canvas is not square. We could instead do this in pixel coordinates:
vec2 centre = vec2(400, 400);
float radius = 100.0;
float dist = distance(fragCoord, centre) - radius;
float sharpness = 50.0;
float spot = exp(-sharpness * dist);
fragColor = vec4(spot);
Or we could adjust for aspect ratio:
// to take into account aspect ratio:
suv.x *= iResolution.x / iResolution.y;
Notice how odd this is: we are drawing shapes (points, circles) not by geometry, but by specifying a function of a field. We didn't trace a line, we didn't do any geometry really, we just defined a function of space that maps a 2D position into a color, using only the principle of *signed distance*. This method of drawing by 'distance function' can be surprisingly powerful, and we'll return to it later.
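As a small taste of why this is powerful (a hedged sketch reusing suv from above): distance fields can be combined with ordinary math, for example min() acts as a union of shapes.

```glsl
// two circle distance fields, combined with min() for a union
float d1 = length(suv - vec2(-0.4, 0.0)) - 0.3;
float d2 = length(suv - vec2( 0.4, 0.0)) - 0.3;
float d  = min(d1, d2);           // the union of the two circles
fragColor = vec4(exp(-50.0 * d)); // same falloff trick as before
```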
We used the `iResolution` uniform before to get the canvas size. (The "Uniform" terminology here really means an input parameter to the shader. It is "uniform" because the parameter has the same value for all pixels.) Shadertoy also gives us a few more uniforms to play with:
```glsl
uniform vec3 iResolution; // viewport resolution (in pixels)
uniform float iTime; // shader playback time (in seconds)
uniform float iTimeDelta; // render time (in seconds)
uniform float iFrameRate; // shader frame rate
uniform int iFrame; // shader playback frame
uniform float iChannelTime[4]; // channel playback time (in seconds)
uniform vec3 iChannelResolution[4]; // channel resolution (in pixels)
uniform vec4 iMouse; // mouse pixel coords. xy: current (if MLB down), zw: click
uniform samplerXX iChannel0..3; // input channel. XX = 2D/Cube
uniform vec4 iDate; // (year, month, day, time in seconds)
```
So for example, we can use iMouse.xy to move the circle, and iTime to change its size:
vec2 centre = iMouse.xy;
float radius = 100. * abs(sin(iTime));
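Putting that together with the distance-based spot from before, a complete sketch (in pixel coordinates) might look like this:

```glsl
void mainImage(out vec4 fragColor, in vec2 fragCoord) {
    vec2 centre = iMouse.xy;               // follows the mouse while dragging ((0,0) before any click)
    float radius = 100. * abs(sin(iTime)); // pulses over time
    float dist = distance(fragCoord, centre) - radius;
    float spot = exp(-50.0 * dist);
    fragColor = vec4(spot);
}
```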
There's a lot you can do with math to procedurally generate images as functions of space (and time). Here's a more colourful example of a field varying in time:
// Normalized pixel coordinates (from 0 to 1)
vec2 uv = fragCoord/iResolution.xy;
// Time varying pixel color
vec3 col = 0.5 + 0.5*cos(iTime + uv.xyx+vec3(0,2,4));
// Output to screen
fragColor = vec4(col, 1.0);
One thing GLSL doesn't provide is a noise or random number generator. Some people have worked around this by finding mathematical functions that are pseudo-random -- noisy enough and cheap enough for many simple use cases.
This is generic library code -- you can put this directly into the top of your shader, or in Shadertoy you can click the + to add a "Common" tab, in which you can place library code like this that will be visible to all shaders.
#define RANDOM_SCALE vec4(.1031, .1030, .0973, .1099)
vec2 random2(float p) {
vec3 p3 = fract(vec3(p) * RANDOM_SCALE.xyz);
p3 += dot(p3, p3.yzx + 19.19);
return fract((p3.xx + p3.yz) * p3.zy);
}
vec2 random2(vec2 p) {
vec3 p3 = fract(p.xyx * RANDOM_SCALE.xyz);
p3 += dot(p3, p3.yzx + 19.19);
return fract((p3.xx + p3.yz) * p3.zy);
}
vec2 random2(vec3 p3) {
p3 = fract(p3 * RANDOM_SCALE.xyz);
p3 += dot(p3, p3.yzx + 19.19);
return fract((p3.xx + p3.yz) * p3.zy);
}
vec3 random3(float p) {
vec3 p3 = fract(vec3(p) * RANDOM_SCALE.xyz);
p3 += dot(p3, p3.yzx + 19.19);
return fract((p3.xxy + p3.yzz) * p3.zyx);
}
vec3 random3(vec2 p) {
vec3 p3 = fract(vec3(p.xyx) * RANDOM_SCALE.xyz);
p3 += dot(p3, p3.yxz + 19.19);
return fract((p3.xxy + p3.yzz) * p3.zyx);
}
vec3 random3(vec3 p) {
p = fract(p * RANDOM_SCALE.xyz);
p += dot(p, p.yxz + 19.19);
return fract((p.xxy + p.yzz) * p.zyx);
}
vec4 random4(float p) {
vec4 p4 = fract(p * RANDOM_SCALE);
p4 += dot(p4, p4.wzxy + 19.19);
return fract((p4.xxyz + p4.yzzw) * p4.zywx);
}
vec4 random4(vec2 p) {
vec4 p4 = fract(p.xyxy * RANDOM_SCALE);
p4 += dot(p4, p4.wzxy + 19.19);
return fract((p4.xxyz + p4.yzzw) * p4.zywx);
}
vec4 random4(vec3 p) {
vec4 p4 = fract(p.xyzx * RANDOM_SCALE);
p4 += dot(p4, p4.wzxy + 19.19);
return fract((p4.xxyz + p4.yzzw) * p4.zywx);
}
vec4 random4(vec4 p4) {
p4 = fract(p4 * RANDOM_SCALE);
p4 += dot(p4, p4.wzxy + 19.19);
return fract((p4.xxyz + p4.yzzw) * p4.zywx);
}
Try out a quick example:
vec4 noise = random4(vec3(fragCoord.xy, iTime));
fragColor = vec4(noise);
Note that this is not a very good pseudo-random generator, and sometimes you will see patterns. Better generators are more expensive. Here is a good example: https://www.shadertoy.com/view/ftsfDf
We can also pull in external images into a shader to process them, including videos, webcam streams, and so on. Click on the iChannel0 box under the editor and choose an image or stream to use. We can then access this using the texture function:
vec4 image = texture(iChannel0, uv);
fragColor = image;
So now we can do all kinds of math on that image for classic webcam effects:
// invert
fragColor = 1.-image;
// recolor:
fragColor = image.gbra;
// a kind of saturation:
fragColor = smoothstep(0., 1., image);
// a kind of saturation:
fragColor = smoothstep(0.4, 0.6, image);
// simple greyscale:
fragColor = image.ggga;
// threshold:
fragColor = smoothstep(0.4, 0.41, image.ggga);
// brightness:
fragColor = pow(image, vec4(sin(iTime)+1.5));
Some more library code for common image manipulations:
vec3 desaturate(in vec3 v, in float a ) {
return mix(v, vec3(dot(vec3(.3, .59, .11), v)), a);
}
vec4 desaturate(in vec4 v, in float a ) { return vec4(desaturate(v.rgb, a), v.a); }
float brightnessContrast( float v, float b, float c ) { return ( v - 0.5 ) * c + 0.5 + b; }
vec3 brightnessContrast( vec3 v, float b, float c ) { return ( v - 0.5 ) * c + 0.5 + b; }
vec4 brightnessContrast( vec4 v, float b, float c ) { return vec4(( v.rgb - 0.5 ) * c + 0.5 + b, v.a); }
float rgb2luma(const in vec3 rgb) { return dot(rgb, vec3(0.2126, 0.7152, 0.0722)); }
float rgb2luma(const in vec4 rgb) { return rgb2luma(rgb.rgb); }
vec3 hue2rgb(const in float hue) {
float R = abs(hue * 6.0 - 3.0) - 1.0;
float G = 2.0 - abs(hue * 6.0 - 2.0);
float B = 2.0 - abs(hue * 6.0 - 4.0);
return clamp(vec3(R,G,B), 0., 1.);
}
vec3 hsv2rgb(const in vec3 hsv) { return ((hue2rgb(hsv.x) - 1.0) * hsv.y + 1.0) * hsv.z; }
vec4 hsv2rgb(const in vec4 hsv) { return vec4(hsv2rgb(hsv.rgb), hsv.a); }
vec3 rgb2hsv(const in vec3 c) {
vec4 K = vec4(0., -0.33333333333333333333, 0.6666666666666666666, -1.0);
vec4 p = c.g < c.b ? vec4(c.bg, K.wz) : vec4(c.gb, K.xy);
vec4 q = c.r < p.x ? vec4(p.xyw, c.r) : vec4(c.r, p.yzx);
float d = q.x - min(q.w, q.y);
return vec3(abs(q.z + (q.w - q.y) / (6. * d + 1e-10)),
d / (q.x + 1e-10),
q.x);
}
vec4 rgb2hsv(const in vec4 c) { return vec4(rgb2hsv(c.rgb), c.a); }
(see more at https://github.com/patriciogonzalezvivo/lygia -- for example, pretty much all the photoshop layer modes are at https://github.com/patriciogonzalezvivo/lygia/blob/main/color/layer.glsl)
Some of these image effects can also depend on the coordinate -- for example, to create vignette effects:
fragColor *= exp(-length(suv));
The texture function needs a specific "sampler" input to sample from (in this case, iChannel0, which Shadertoy provides), as well as a vec2 normalized coordinate for where in the image to sample. That means, of course, that we can sample from different places, not only the current location!
vec2 coord = 0.5 + (suv)*sin(iTime);
//vec2 coord = 0.5 + (suv)*exp(-length(suv));
//vec2 coord = 0.5 + (suv)*exp(sin(iTime)*length(suv));
//vec2 coord = 0.5 + 0.5*mix(suv, suv*sin(iTime), 1.-length(suv));
//vec2 coord = uv + 0.1*(noise.xy-0.5)*length(suv); // a little noise can be a bit like a blur
vec4 image = texture(iChannel0, coord);
This can get pretty complex: https://www.shadertoy.com/view/
We can also use this to do things like comparing or blending nearest pixels. This is a common type of image effect that includes blur, sharpen, erode, edge highlight, etc. These are called [convolution filters](https://en.wikipedia.org/wiki/Kernel_(image_processing)). Convolution simply means multiplying several pairs of terms together and summing the results. In image processing, this usually means multiplying a square (or rectangular) region of an image with a "kernel" matrix.
First, we define a kernel for the relative weights of the neighboring pixels. Then we loop over these pixels, sampling the image at each point, and multiplying it with the corresponding kernel weight, summing up the results.
```glsl
// some example kernels:
mat3 identity = mat3(
0, 0, 0,
0, 1, 0,
0, 0, 0
);
mat3 edge0 = mat3(
1, 0, -1,
0, 0, 0,
-1, 0, 1
);
mat3 edge1 = mat3(
0, -1, 0,
-1, 4, -1,
0, -1, 0
);
mat3 edge2 = mat3(
-1, -1, -1,
-1, 8, -1,
-1, -1, -1
);
mat3 sharpen = mat3(
0, -1, 0,
-1, 5, -1,
0, -1, 0
);
mat3 emboss = mat3(
-2, -1, 0,
-1, 1, 1,
0, 1, 2
);
mat3 boxBlur = mat3(
1, 1, 1,
1, 1, 1,
1, 1, 1
) * 1.0/9.0;
mat3 gaussBlur = mat3(
1, 2, 1,
2, 4, 2,
1, 2, 1
) * 1.0/16.0;
mat3 kernel = identity;
vec2 oneTexel = 1./iResolution.xy;
// loop over a 3x3 region, summing results:
vec4 sum = vec4(0.0);
for (int i = -1; i <= 1; i++) {
for (int j = -1; j <= 1; j++) {
// get the texture coordinate offset for this texel:
vec2 offset = vec2(float(i), float(j)) * oneTexel;
// get the image at this texel:
vec4 pixelColor = texture(iChannel0, uv + offset);
// Apply kernel weight and sum:
sum += pixelColor * kernel[i+1][j+1];
}
}
fragColor = sum;
```
There are some other spatial image processes that are similar to convolution, but not using summation (so they are not strictly convolution), which you could explore:
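For example, a morphological dilation (a sketch, not part of the original notes) replaces the weighted sum with a maximum over the neighbourhood; an erosion would use a minimum instead.

```glsl
// dilate: take the maximum over a 3x3 neighbourhood
vec2 oneTexel = 1. / iResolution.xy;
vec4 result = vec4(0.0);
for (int i = -1; i <= 1; i++) {
    for (int j = -1; j <= 1; j++) {
        vec2 offset = vec2(float(i), float(j)) * oneTexel;
        result = max(result, texture(iChannel0, uv + offset));
    }
}
fragColor = result;
```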
We can also use mat objects to perform spatial transformations of the image. Here's a rotation matrix:
mat2 rotateMat2(float angle) {
float c = cos(angle);
float s = sin(angle);
return mat2(
c, s,
-s, c
);
}
If we apply this to our uv coordinate, we can rotate the image:
uv = rotateMat2(iTime) * uv;
We can also scale using a mat2:
mat2 scaleMat2(float s) {
return mat2(
s, 0,
0, s
);
}
If we want to translate, however, we need to use a mat3. The idea is simple: we assume that there is a 3rd coordinate on the input vector, equivalent to uv3 = vec3(uv, 1), so that we can then multiply it with a mat3. Then our transforms look like this:
mat3 translateMat3(float x, float y) {
return mat3(
1, 0, 0, // first column (accessed as m[0])
0, 1, 0, // second column (accessed as m[1])
x, y, 1 // third column (accessed as m[2])
);
}
mat3 rotateMat3(float angle) {
float c = cos(angle);
float s = sin(angle);
return mat3(
c, s, 0,
-s, c, 0,
0, 0, 1
);
}
mat3 scaleMat3(float s) {
return mat3(
s, 0, 0,
0, s, 0,
0, 0, 1
);
}
With these we can create quite complex transformations:
// convert to a vec3:
vec3 uv3 = vec3(uv, 1.);
// apply several transformations:
uv3 = translateMat3(-0.5, -0.5) * scaleMat3(sin(iTime)) * rotateMat3(iTime) * translateMat3(0.5, 0.5) * uv3;
// convert back to vec2:
uv = uv3.xy;
So far we are processing the image over value (color), and over space. But we can also process it over time. To do that, we need to set up a feedback loop.
For example, what if we wanted to apply a feedback blur that is also creating spiral trails?
In Shadertoy we can do this by adding a "Buffer" stage. Again, use the + button and select "Buffer A". Now, in the Buffer A tab, set the iChannel0 input to "Buffer A" itself, so that it can read its own previous frame.
In the Image tab, which defines what we actually see, also set the iChannel0 input to "Buffer A", and display it:
// in Image tab, show the Buffer A content from iChannel0
vec2 uv = fragCoord/iResolution.xy;
fragColor = texture(iChannel0, uv);
Back in the Buffer A tab, first let's set it up to display its own last frame:
vec2 uv = fragCoord/iResolution.xy;
fragColor = texture(iChannel0, uv);
Now we can add something to this to see the feedback:
vec4 noise = random4(vec3(fragCoord.xy, iTime));
// add a white dot if the noise function is >= 0.999:
fragColor = fragColor + vec4(step(0.999, noise.x));
This will gradually fill up the image. We can also let the image decay:
vec4 noise = random4(vec3(fragCoord.xy, iTime));
float decay = 0.99;
fragColor = fragColor*decay + vec4(step(0.999, noise.x));
And for something more interesting, instead of feeding back the same pixel, we could read from the pixel above it:
vec2 uv = fragCoord/iResolution.xy;
fragColor = texture(iChannel0, uv + vec2(0., 0.01));
Another common pattern here is to set up an initialization on the first frame, by using iFrame == 0, and the Rewind button on the shader view to reset this to zero:
vec2 uv = fragCoord/iResolution.xy;
fragColor = texture(iChannel0, uv + vec2(0., 0.01));
vec4 noise = random4(vec3(fragCoord.xy, iTime));
// initialize:
if (iFrame == 0) {
fragColor = noise;
}
Notice it blurring over time? That's because we are using linear interpolation on the iChannel0 settings. Change the filter to "nearest" and it will not blur.
Try doing some spatial transforms on the image in the feedback loop!
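For example (a sketch, assuming the rotateMat2 helper and the random4 library code are pasted into the Buffer A tab): we can read back from a slightly rotated, zoomed-in coordinate each frame, which turns the feedback into a swirl.

```glsl
// in Buffer A: feedback through a small rotation and zoom each frame
vec2 uv = fragCoord / iResolution.xy;
vec2 suv = uv * 2.0 - 1.0;                    // centre the coordinate
suv = rotateMat2(0.01) * suv * 0.995;         // tiny rotation plus zoom-in
vec2 coord = suv * 0.5 + 0.5;                 // back to the 0..1 range
fragColor = texture(iChannel0, coord) * 0.99; // with a little decay
// keep seeding with noise so there is something to swirl:
vec4 noise = random4(vec3(fragCoord.xy, iTime));
fragColor += vec4(step(0.999, noise.x));
```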
Feedback is also essential for making simulations of complex systems.
We now have enough to write a cellular automaton, such as the Game of Life:
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
// get self state
vec4 C = texture(iChannel0, (fragCoord+vec2( 0, 0))/iResolution.xy);
// am I alive?
int alive = int(C.x > 0.5);
// get state of all neighbour pixels:
vec4 E = texture(iChannel0, (fragCoord+vec2( 1, 0))/iResolution.xy);
vec4 W = texture(iChannel0, (fragCoord+vec2(-1, 0))/iResolution.xy);
vec4 N = texture(iChannel0, (fragCoord+vec2( 0, 1))/iResolution.xy);
vec4 NE = texture(iChannel0, (fragCoord+vec2( 1, 1))/iResolution.xy);
vec4 NW = texture(iChannel0, (fragCoord+vec2(-1, 1))/iResolution.xy);
vec4 S = texture(iChannel0, (fragCoord+vec2( 0,-1))/iResolution.xy);
vec4 SE = texture(iChannel0, (fragCoord+vec2( 1,-1))/iResolution.xy);
vec4 SW = texture(iChannel0, (fragCoord+vec2(-1,-1))/iResolution.xy);
// count number of living neighbours:
int neighbours = int(E.x > 0.5) + int(W.x > 0.5)
+ int(NE.x > 0.5) + int(NW.x > 0.5)
+ int(SE.x > 0.5) + int(SW.x > 0.5)
+ int(N.x > 0.5) + int(S.x > 0.5);
// should I live on?
int liveon = alive;
// the rules (see https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life)
if (alive == 1) {
// die by loneliness or overcrowding:
if (neighbours < 2 || neighbours > 3) liveon = 0;
} else {
// birth:
if (neighbours == 3) liveon = 1;
}
// update my state:
fragColor = vec4(float(liveon));
// or for a more colourful variant:
// fragColor = vec4(liveon, int(alive != liveon), alive, 1);
vec4 noise = random4(fragCoord.xy);
// initialize:
if (iFrame == 0) {
fragColor = vec4(step(0.8, noise.x));
}
// add some noise near the mouse:
if (iMouse.z > 0.0) {
// if the mouse is held, randomize some pixels near the mouse
if (distance(fragCoord, iMouse.xy) < 10.0) {
fragColor = vec4(step(0.8, noise.x));
}
}
}
Now let's try something completely different.
Ray tracing is a rendering technique for generating an image by tracing the path of light as pixels in an image plane and simulating the effects of its encounters with virtual objects — Wikipedia
First, for each pixel in the image, we need a 3D ray. A ray is a line with an origin and direction. We can build these like this:
vec3 camera_pos = vec3(0, 0, 0);
vec3 camera_dir = normalize(vec3(suv.xy, 7));
We can put a simple object, such as a sphere, into this space. A sphere has a centre and radius:
vec3 sphere_pos = vec3(1, 0, 20);
float sphere_rad = 2.0;
// to make our life easier, let's combine this into a vec4:
vec4 sphere = vec4(sphere_pos, sphere_rad);
// let's also define a light position:
vec3 light_pos = vec3(8, 4, 10);
Now we need a function to test whether a given ray intersects with a sphere. The explanation of this math is a bit beyond what we can cover here, but have a look here
// returns distance to first intersection with the sphere from the ray:
// returns -1 if the ray does not intersect with the sphere
float intersectSphere(vec3 rayOrigin, vec3 rayDirection, vec3 sphereCenter, float sphereRadius) {
vec3 L = sphereCenter - rayOrigin;
float tca = dot(L, rayDirection);
float d2 = dot(L, L) - tca * tca;
float radius2 = sphereRadius * sphereRadius;
if (d2 > radius2) return -1.0; // No intersection
float thc = sqrt(radius2 - d2);
float t0 = tca - thc;
float t1 = tca + thc;
if (t0 < 0.0 && t1 < 0.0) return -1.0; // Both intersections behind ray origin
if (t0 < 0.0) return t1; // Ray origin inside sphere, return far intersection
return t0; // Return closest intersection
}
Can we see it?
float d = intersectSphere(camera_pos, camera_dir, sphere_pos, sphere_rad);
if (d > 0.) {
fragColor = vec4(1);
}
To begin to light this sphere, we need to know where exactly our intersection point is, and from that we can determine the normal, which is to say, the direction pointing perpendicularly away from the sphere's surface:
// move the right distance along the ray to find the point:
vec3 pt = camera_pos + d*camera_dir;
// a sphere's normal is simple, it always points away from the sphere center
// we normalize it to ensure it has a length of 1 (a unit vector)
// this only gives direction, and is useful in the math later
vec3 normal = normalize(pt - sphere_pos);
We can do diffuse lighting relative to a particular light direction (for sunlight), or by deriving a light direction from the relative positions of the surface point and a light source:
// again, normalize to get a unit-length direction vector
// (pointing from the surface point toward the light):
vec3 light_dir = normalize(light_pos - pt);
// similarity of the light direction and the surface normal:
float diffuse = max(dot(normal, light_dir), 0.);
// similarity of the view direction with the reflected light direction:
float specular = max(dot(-camera_dir, reflect(-light_dir, normal)), 0.);
fragColor = vec4(specular);
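To combine these terms into a simple shaded image, a sketch might look like this (with an arbitrarily chosen surface colour -- one option among many):

```glsl
// inside the if (d > 0.) branch, after computing diffuse and specular:
vec3 surface_color = vec3(1.0, 0.5, 0.2);      // arbitrary material colour
vec3 ambient = vec3(0.1);                      // a little fill light
vec3 shaded = ambient
            + surface_color * diffuse
            + vec3(1.0) * pow(specular, 32.0); // tighten the highlight
fragColor = vec4(shaded, 1.0);
```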
To continue:
TODO: A different approach, using distance functions
For the purposes of the course, please submit your final papers by December 8th, thank you!!
Nov 13, 2023 Class Recording
Research is about sharing. Sometimes, that requires sharing how.
Nov 20, 2023 Class Recording
Nov 27, 2023 Class Recording
We have about 10 minutes per presentation, plus 5 minutes for questions & discussion!
Thank you everyone for a wonderful semester!
Recordings of the weekly sessions will be here: