devlog #02 - Implementing Implicit Neural Networks
Last updated: 2024-04-26 ⌛
I recently got the opportunity in one of my classes to implement any ML model I’d like and write a scientific paper on it. Since I’m very interested in research, I figured I would aim to eventually publish the result rather than use it only once for this class.
As it happens, the model I chose was something I had been planning to look into for a while: a specific kind of neural layer called an implicit layer. These layers can solve largely the same kinds of tasks that plain deep linear, recurrent, or convolutional layers can, such as language classification, simple regression and classification, and even image-processing tasks.
What is an Implicit Layer after all?
The biggest difference between a vanilla linear layer and an implicit layer is that an implicit layer’s output is defined as an equilibrium point, found by solving for the root of a given equation rather than by a fixed forward computation. That is, an implicit layer looks something like this:
\[\mathbf{z}^{*} = f(\mathbf{z}^{*}, \mathbf{x}), \qquad \text{e.g. } f(\mathbf{z}, \mathbf{x}) = \sigma(\mathbf{W}\mathbf{z} + \mathbf{U}\mathbf{x} + \mathbf{b})\]
Here the output \(\mathbf{z}^{*}\) is not produced by applying the layer once; it is defined implicitly as the solution of the equation above. A solver (plain fixed-point iteration, Newton’s method, etc.) searches for the \(\mathbf{z}\) at which \(f(\mathbf{z}, \mathbf{x}) - \mathbf{z} = 0\). This is a very interesting concept, as it allows for a more flexible and expressive layer that can handle more complex tasks than a simple linear layer.
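To make this concrete, here is a minimal NumPy sketch of an equilibrium-style implicit layer that finds \(\mathbf{z}^{*} = f(\mathbf{z}^{*}, \mathbf{x})\) by plain fixed-point iteration. All the names here (W, U, b, implicit_layer, the tolerance and iteration cap) are illustrative choices of mine, not from any particular framework, and the fixed weights stand in for what would normally be learned parameters.

```python
import numpy as np

# Sketch of an implicit (equilibrium) layer: the output z* is defined
# implicitly as the fixed point of f(z, x) = tanh(W @ z + U @ x + b).
rng = np.random.default_rng(0)
dim_x, dim_z = 4, 8

# W is scaled down so the iteration is (empirically) a contraction and converges.
W = rng.normal(size=(dim_z, dim_z)) * 0.1
U = rng.normal(size=(dim_z, dim_x))
b = rng.normal(size=dim_z)

def f(z, x):
    """One application of the layer's transformation."""
    return np.tanh(W @ z + U @ x + b)

def implicit_layer(x, max_iter=100, tol=1e-6):
    """Solve z = f(z, x) by simple fixed-point iteration.

    Equivalently, this finds a root of g(z) = f(z, x) - z; a fancier
    solver (Newton, Anderson acceleration) could replace this loop.
    """
    z = np.zeros(dim_z)
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

x = rng.normal(size=dim_x)
z_star = implicit_layer(x)
# At the equilibrium, z* should satisfy z* ≈ f(z*, x):
print(np.allclose(z_star, f(z_star, x), atol=1e-5))  # True
```

The interesting part for training (which I’ll get to later) is that you don’t have to backpropagate through the solver’s iterations: the implicit function theorem lets you differentiate directly through the equilibrium condition, so the memory cost stays constant no matter how many solver steps were taken.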