2021
DOI: 10.1007/978-3-030-72914-1_2

Network Bending: Expressive Manipulation of Deep Generative Models

Abstract: We introduce a new framework for manipulating and interacting with deep generative models that we call network bending. We present a comprehensive set of deterministic transformations that can be inserted as distinct layers into the computational graph of a trained generative neural network and applied during inference. In addition, we present a novel algorithm for analysing the deep generative model and clustering features based on their spatial activation maps. This allows features to be grouped together bas…
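The core idea in the abstract, inserting a deterministic transformation as an extra layer between the layers of an already trained network and applying it to selected features at inference time, can be sketched in a few lines. This is a toy illustration only, not the authors' implementation: the "network" is two fixed functions standing in for generator blocks, and every name here (`bend`, `scale`, `flip_horizontal`, `run`) is hypothetical.

```python
from typing import Callable, List, Sequence

FeatureMap = List[List[float]]   # one channel, H x W
Features = List[FeatureMap]      # C channels
Layer = Callable[[Features], Features]


def scale(factor: float) -> Callable[[FeatureMap], FeatureMap]:
    """Deterministic transform: multiply every activation by `factor`."""
    return lambda fmap: [[v * factor for v in row] for row in fmap]


def flip_horizontal(fmap: FeatureMap) -> FeatureMap:
    """Deterministic transform: mirror a feature map left to right."""
    return [list(reversed(row)) for row in fmap]


def bend(transform: Callable[[FeatureMap], FeatureMap],
         channels: Sequence[int]) -> Layer:
    """Wrap a per-channel transform as a layer that only touches `channels`."""
    selected = set(channels)

    def layer(features: Features) -> Features:
        return [transform(f) if i in selected else f
                for i, f in enumerate(features)]

    return layer


def run(layers: Sequence[Layer], features: Features) -> Features:
    """Apply the layers in order, as a stand-in for a forward pass."""
    for layer in layers:
        features = layer(features)
    return features


# Toy stand-ins for the layers of a trained generator (assumption: a real
# model would have learned weights; these are fixed for illustration).
def double(features: Features) -> Features:
    return [[[2 * v for v in row] for row in f] for f in features]


def add_one(features: Features) -> Features:
    return [[[v + 1 for v in row] for row in f] for f in features]


network = [double, add_one]
# "Bent" network: same layers, with a transform inserted between them
# that flips channel 0 only. No retraining is involved.
bent = [double, bend(flip_horizontal, channels=[0]), add_one]

x = [[[1.0, 2.0]], [[3.0, 4.0]]]   # 2 channels, each 1 x 2
plain_out = run(network, x)
bent_out = run(bent, x)
```

The channel list passed to `bend` is where a grouping of features would plug in: instead of hand-picked indices, one could pass the indices belonging to one cluster of features.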

Citations: cited by 9 publications (14 citation statements)
References: 34 publications (64 reference statements)

“…In this paper, we have demonstrated our network bending framework in both the image and audio domains. For the image domain we have used StyleGAN2 [4], the state of the art generative model for unconditional image generation, in the audio domain we have built our own custom generative model to demonstrate how the same principles of clustering features and applying transformations to clustered features first presented in [1] can be applied directly to another domain. The generative model for audio we have presented is building on a much smaller body of research, and has more room for improvement in terms of the fidelity of the generated outputs, however it is still adequate and demonstrates that our clustering algorithm is capable of discovering semantically meaningful components of the signal (Figure 5).…”
Section: Discussion | Type: mentioning
confidence: 99%
“…This process is one that could be particularly useful for music production, where an artist may want to create multiple variations of recordings they have created, that can later be layered into a music composition. An alternative use-case of this process used in the image domain is given in [1], where the chaining of multiple stochastic layers was used in the production of a series of five EP (extended play record) artworks that shared a common aesthetic theme.…”
Section: Manipulation Pipeline | Type: mentioning
confidence: 99%
“…Interestingly, [29] showed that one can perform arithmetic in the latent space that affects predictable changes in image space. Since these works, a host of methods have been proposed to explore the latent structure in these generators by imposing structure at training-time [4,24] or more recently in the pre-trained generators themselves [1,12,31,32,34,36]. However, of the approaches that decompose the intermediate features directly (such as [12]), a linear decomposition is applied-where we argue a multilinear one can be more suitable in providing an ability to locate different categories of transformation.…”
Section: Related Work | Type: mentioning
confidence: 99%
“…Liu et al. [16] propose a GAN that involves semantic conditional information of the input by embedding facial attribute vectors in both the generator and discriminator, so that the model could be guided to output elderly face images with attributes faithful to each corresponding input. Broad et al. [17] introduce [a] network bending model that allows for the direct manipulation of semantically meaningful aspects of the generative process. In exploring the limit of how far human expressions can be captured, in this article we have train[ed] GANs using a collection of portraits of detained individuals, portraits of dead people who died of violent causes and people whose portraits were taken during an orgasm.…”
Section: Generative Adversarial Network | Type: mentioning
confidence: 99%