{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Algorithmic differentiation using dual numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Consider a function $y(x) = \\cos(x^2)$. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "y (generic function with 1 method)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y(x) = cos(x^2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It derivative is trivially $\\frac{\\mathrm d y}{\\mathrm d x} = -2x\\sin(x^2)$. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dydx (generic function with 1 method)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dydx(x) = -2*x*sin(x^2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's now evaluate this result at a particular value of $x$, say" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = 2" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.027209981231713" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dydx(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But now let'd find the result using algorithmic differentiation (AD)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building an AD functionality from the scratch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we are going to develop our own class (actually `struct`) in Julia for [dual numbers](https://en.wikipedia.org/wiki/Dual_number). To give the credit, the code below is inspired by a code from the recommendable introductory book [Algorithms for Optimization](https://mitpress.mit.edu/books/algorithms-optimization) Mykel J. Kochenderfer and Tim A. Wheeler (they even provide [Jupyter Notebooks](https://github.com/sisl/algforopt-notebooks)). " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "struct Dual\n", " v # the VALUE part\n", " d # the DERIVATIVE part\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we need to overload the involved basic operations such as `addition` and `multiplication` of two dual numbers, `multiplication by a scalar`, `squaring` and finding the value of `cosine` function." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "Base.:+(a::Dual,b::Dual) = Dual(a.v+b.v,a.d+b.d)\n", "Base.:*(a::Dual,b::Dual) = Dual(a.v*b.v,a.d*b.v+b.d*a.v)\n", "Base.:*(a::Number,b::Dual) = Dual(a*b.v,a*b.d)\n", "Base.:^(a::Dual,b::Int) = Dual(a.v^b,b*a.v^(b-1)*a.d)\n", "Base.:cos(a::Dual) = Dual(cos(a.v),-sin(a.v)*a.d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's now check the functionality of the individual functions" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual(2, 1)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = Dual(x,1)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual(3, 0)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Y = Dual(3,0)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual(6, 3)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Y*X" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual(4, 4)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X^2" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual(6, 3)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "3*X" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual(-0.4161468365471424, -0.9092974268256817)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cos(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's use the new functionality to compute the derivative of the assigned function $\\cos(x^2)$" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual(-0.6536436208636119, 3.027209981231713)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cos(X^2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using a dedicated library (ForwardDiff.jl)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In practice, you will hardly feel a need to implement your own library for algorithmic differentiation. Instead, you may want to use one of those avaialable ones, such as [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl)." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "┌ Info: Precompiling ForwardDiff [f6369f11-7733-5829-9624-2563aa707210]\n", "└ @ Base loading.jl:1278\n" ] } ], "source": [ "using ForwardDiff" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual{Nothing}(2,1)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = ForwardDiff.Dual(x,1)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Dual{Nothing}(-0.6536436208636119,3.027209981231713)" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Y = cos(X^2)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-0.6536436208636119" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Y.value" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1-element ForwardDiff.Partials{1,Float64}:\n", " 3.027209981231713" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Y.partials" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But strictly speaking, we wouldn't even bother to work with dual numbers and ask for computing the derivative directly" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.027209981231713" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ForwardDiff.derivative(y,x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare with the exact (obtained from a symbolic expression) version" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.027209981231713" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dydx(x)" ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.5.3", "language": "julia", "name": "julia-1.5" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.5.3" } }, "nbformat": 4, "nbformat_minor": 2 }