Encoding Electromagnetic Transformation Laws for Dimensional Reduction

Electromagnetic phenomena are mathematically described by solutions of boundary value problems. Exploiting symmetries of these boundary value problems with techniques of dimensional reduction requires justifying that the derivative in the symmetry direction is constant or even vanishing. A generalized notion of symmetry can be defined with different directions at every point in space, as long as it is possible to exhibit a unidirectional symmetry in some coordinate representation. This can be achieved, e.g., when the symmetry direction is constructed from a unidirectional symmetry via a coordinate transformation, which in turn poses a demand on the boundary value problem. Coordinate-independent formulations of boundary value problems do exist, but turning that theory into practice demands a pedantic process of back-translation into computational notions. This becomes even more challenging when multiple chained transformations are necessary for propagating a symmetry. We try to fill this gap and present the more general, isolated problems of that translation. Within this contribution, the partial derivative and the corresponding chain rule of multivariate calculus are investigated with respect to their encodability in computational terms. We target the layer above univariate calculus, but below tensor calculus.


Introduction
There is a variety of different formulas for the transformation of vector components of fields and fluxes in classical electromagnetism. When changing the coordinate system, the vector components need to be transformed, because vector components quantify directions induced by the coordinate system. This results in different matrix-transformation schemes, depending on the physical meaning of the vectors in question. The different transformation properties of the objects considered in electromagnetic theories have been known for a long time 8;22 . They can be systematically formulated within tensor calculus at the cost of using antisymmetric tensors. Representing electromagnetic objects with antisymmetric tensors leads to a high amount of combinatorics in tensor calculus, especially when resolving permutations. The theory of differential forms provides a formalism to abstract over that. An electromagnetic boundary value problem can be posed with the help of an observer structure 18 within the theory of differential forms. For a differentiable manifold M (see Sec. 1.1), a smooth nonzero vector field T on M and a smooth one-form τ on M such that τ (T ) = 1, the pair (T, τ ) is called an observer structure. Using this observer structure (T, τ ), the differential operators d_τ and L_T can be established.
A generic boundary value problem over one domain can be transformed to another domain by transforming the involved differential forms. Boundary value problems are regarded as equivalent 18 if they can be transformed into each other in that way. If a boundary value problem is suitable for using techniques of dimensional reduction, then these techniques can be applied to all equivalent boundary value problems. For an observer structure (T, τ ) on a manifold M , an observer structure (Γ, γ) on a manifold N , a transformation F : N → M between these manifolds and the induced transformation F * on the differential forms, the two formulations of Fig. 1 pose the same boundary value problem. There is a very regular pattern present in these equations stating what needs to be done to transform a boundary value problem: the differential forms of the original boundary value problem have to be transformed with the pullback F * to appear in the transformed boundary value problem. An introduction to the calculus on manifolds and to differential forms can be found in the literature 19 .
On a machine, computations for solving a boundary value problem operate on number data, i.e., the numbers that are stored within the machine's memory. In the current formulation it might not be that obvious anymore how to convert the original number data into the number data for the transformed boundary value problem in terms of actual computations. A confident implementation of a program benefits from an obvious description of the computation. Therefore, multiple formulations complement each other: for the computation, low-level matrix operations and index operations can directly be executed by the machine, but for deriving the computation, only the high-level differential forms statements can be overviewed. We are convinced that high-level abstractions as in Fig. 1 pay off in the most beneficial way only when stacked on top of a layer providing a) a good abstraction to provide coordinate transformation rules in terms of matrix-based or just general computation schemes for a given tensorial formulation and b) a good abstraction to incorporate combinatorial notions, especially the enumeration of permutations, which enables the reasoning on the level of differential forms to be automatically transferred into a tensorial representation.
The purpose of an implementation is to put the machine into a state that is most efficient for processing all necessary computations of a numerical scheme which solves the boundary value problem. Abstractions help in organizing the implementation but should not prevent using the machine in its most efficient way. Therefore, most abstractions are usually stripped before a cost-intensive computational task is started. They should only allow producing an efficient computational scheme on the spot, in some form that is available on the machine: matrix, parallel or other kinds of efficient computational primitives. It is important to emphasize that the corresponding raw number data does not need to change for every transformation process.
For the first part a), i.e. the generation of transformation rules, in this paper, we show a way to realize such a layer which is independent of the actual function representation. The second part b) is motivated in Sec. 2.5 and not treated in this paper.
The rest of this paper is organized as follows: Since this is an interdisciplinary topic, we will give some necessary context for readers from different domains. This context is tailored to an implementation on a machine. In Sec. 1.1 we introduce the notion of the frame bundle and of bundles associated to the frame bundle from bundle theory, in order to define geometric quantities and give the general transformation rule for (p, q)-tensor ω-densities. In Sec. 1.2 we introduce shape functions and degrees of freedom of the finite element method in the context of differential forms. An introduction to the untyped λ-calculus is given in Sec. 1.3. A definition of the partial derivative in terms of a univariate derivative is given in Sec. 2. In Sec. 3 the untyped λ-calculus is augmented with axioms for a typed variant for the purpose of expressing the calculations of the previous section. In Sec. 4 we give a guideline on how this augmented λ-calculus can be applied in a software project.

Figure 2: Objects involved in a covariant treatment of an associated fibre bundle. All arrows in this diagram denote functions. The name of a function is written next to its arrow. At the beginning of an arrow is the domain space and at the end of an arrow is the codomain space of the corresponding function. A product of spaces is denoted by × and, similarly, the parallel composition of two functions operating on these product spaces is also denoted by ×.

Electromagnetical Context
Electromagnetic theory is concerned with the spatial and temporal relation of different physical quantities such as potentials, forces, fluxes and densities 21 . These are often grasped with respect to a coordinate system and its coordinate-induced directions. A base for all directions at a point is called a frame 12 . The coordinate system is called a chart and it is modeled as a continuous mapping from points p of a topological space to their coordinates within R n . A collection of such systems is called an atlas, and if the whole space can be covered by overlapping charts into R n for the same n, it is called locally Euclidean. A topological manifold is defined by additionally demanding the Hausdorff 18 property and the space being second countable 18 . If the chart transition functions of an atlas are arbitrarily often differentiable, the atlas is called smooth. Two atlases over M are smoothly equivalent when their union is also a smooth atlas over M . An equivalence class of smoothly equivalent atlases over M is called a smooth structure. A topological manifold M endowed with a smooth structure is called a smooth manifold. Analogously, regarding only k-times differentiable chart transition functions for k > 0 leads to a differentiable structure. A topological manifold endowed with a differentiable structure is called a differentiable manifold 18 . On the differentiable manifold, we defined the observer structure that was necessary to establish the differential operators for the boundary value problem in the introduction. A far-reaching introduction to this topic can be found in the literature 5 12 .
Given two manifolds, one called the total space E and one called the base space M , and a continuous surjective function π : E → M , the tuple (E, π, M ) is called a bundle of manifolds. Here the preimage preim π (p) of a point p ∈ M with respect to π is called the fibre at the point p, denoted by F p . If the fibres of all points are homeomorphic to some manifold F , the bundle is called a fibre bundle with typical fibre F . Globally over the manifold, tensor fields, vector fields and differential forms are considered. They are modeled as sections of some fibre bundle where, locally at a point, tensors, vectors and co-vectors of some algebra are considered. One specifically important algebra for that purpose is the exterior algebra of local covectors, out of which global differential forms are built.
When answering why "exterior differential forms" are useful as a formalism for the modeling of electromagnetic laws, some authors 12 justify this with "the alternating algebraic structure of integrands that gave rise to the development of exterior algebra and calculus which is becoming more and more recognized as a powerful tool in mathematical physics" 2 . Further, we will make use of the generalization of a tensor, the geometric quantity, being "defined by the action of the general linear group on a certain set of elements" 3 . Examples are tensor-valued differential forms and twisted tensors.
Electromagnetism as a physical effect does not depend on a chosen coordinate system, a property which is called general covariance. In our new wording, a geometric quantity at some point should not depend on a chosen frame. Now, a standard mathematical conjuring trick in order to avoid an arbitrary choice is to attach that choice to the objects in question: to attach the chosen frame to the quantity, in our case.
The theory of associated fibre bundles can describe different kinds of fibre bundles that fulfill an equivalence relation. This equivalence relation is expressed by the means of a Lie group G with respect to some G-bundle.
A G-bundle will be introduced straightaway.
For a manifold M , a cover {U α } of M by open sets, a vector space V and a group representation ρ : G → GL(V ), where GL(V ) is the general linear group over V , it is possible to obtain 5 a vector bundle (E, π, M ) using transition functions g αβ : U α ∩ U β → G. This is only possible when the compatibility conditions 5 are fulfilled. It is done 4 by partitioning the disjoint union ⊔ α (U α × V ) with the equivalence relation (α, p, v) ∼ (β, p, ρ(g αβ (p)) v) for p ∈ U α ∩ U β . A vector bundle obtained in this way is called a G-bundle and V is the standard fibre 5 . When V = G, the G-bundle is called principal 4 . The frame bundle LM is a principal G-bundle where G is the general linear group.
The fibre bundle of frames over a smooth manifold M is called the frame bundle and denoted by LM . At a point p, for some chosen frame e ∈ L p M and some geometric quantity f ∈ F p from the fibre F p , we regard the tuple (e, f ) as one representation of a geometric quantity at that point. But one could choose another frame e′, which is done by choosing another chart, or coordinate system; the new frame is then related ∼ to the old one by means of the Jacobian J at that point p:

e′ i = e j J j i .

This is a right action on the frames, and the corresponding left action on the fibre, for all so-called (p, q)-tensor ω-densities, reads

f ′ i1...ip j1...jq = (det J) ω (J −1 ) i1 k1 · · · (J −1 ) ip kp J l1 j1 · · · J lq jq f k1...kp l1...lq .

Both definitions make use of a sum convention, summing over all equal indices.
The Jacobian J is an element of the general linear group from the definition of a geometric quantity. That makes (p, q)-tensor ω-densities a special case of geometric quantities. Not one tuple (e, f ) with one chosen frame, but the equivalence class of all tuples that can be related ∼ to each other with some J makes a value of a geometric quantity at a point p (adapted from pp. 212-214 of Baez 5 ). This is expressed in the inner part of Fig. 2: by taking the space LM × F but partitioning it (LM × F )/G with respect to a group G. This group is the general linear group in our case. The new fibre bundle (LM × F )/G is said to be associated to the principal G-bundle LM .
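As a computational illustration of the two actions above, the following sketch (with our own, hypothetical function names and an assumed sign convention for the actions) transforms vector and covector number data with a Jacobian J and checks that their pairing is frame-independent; a scalar ω-density picks up a determinant factor.

```python
import numpy as np

def transform_vector(J, v):
    # Contravariant ((1,0)-tensor) components transform with the inverse Jacobian.
    return np.linalg.solve(J, v)

def transform_covector(J, w):
    # Covariant ((0,1)-tensor) components transform with the Jacobian itself.
    return J.T @ w

def transform_density(J, rho, omega=1):
    # A scalar omega-density picks up a power of det J.
    return np.linalg.det(J) ** omega * rho

# The pairing w(v) is a frame-independent number and must stay invariant.
J = np.array([[2.0, 1.0], [0.0, 3.0]])
v = np.array([1.0, 2.0])
w = np.array([4.0, 5.0])
invariant_before = w @ v
invariant_after = transform_covector(J, w) @ transform_vector(J, v)
```

The invariance of the pairing is exactly what makes the opposite (left and right) actions on frame and fibre fit together.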
For the implementation on a machine, we are most likely not to work with a point p ∈ M but with its coordinates xyz(p) ∈ R 3 . These coordinates are with respect to some chart xyz : M → R 3 , where R 3 is denoted as R xyz in Fig. 2. That chart xyz induces directions and in particular one concrete frame ∂ xyz (R xyz ) at each point p. To this frame there corresponds exactly one number data from the fibre F , so as programmers we consider the chart representation xyz σ of a section σ in the implementation. When changing charts, the chart transition map induces a change of frame at every point, and the number data has to be transformed accordingly by means of the Jacobian of that map. Usually one only represents the blue bits of Fig. 2 as data and the green bits of Fig. 2 as computations in an implementation. The remaining black bits might be treated in an opaque way. This is a technique in creating programming interfaces where objects are exposed via references which are of a defined reference type. That reference is called opaque when it refers to unexposed or even undefined data while the representation of the reference itself is known 5 . Even though the black bits are not themselves represented in an implementation as number data, their rules of operation are a candidate for entering an implementation as rules of opaque references. Opaque references can be used to restrict the usage of operations on number data to valid cases. The amount and flexibility of expressible restrictions for that purpose is a property of the targeted programming language. In Sec. 3 we make use of a dependently typed 16 programming language that offers high flexibility in expressing restrictions, to increase confidence in our approach. In our performance-critical code we make use of a deterministic just-in-time-compiled programming language that offers partial recompilation and high flexibility for code generation, to help put the machine into its most efficient state for a computation.
The theory of associated fibre bundles of the frame bundle LM provides the notion of a (p, q)-tensor ω-density. This notion is sufficient to express all electromagnetic quantities of interest, and they share a single transformation law (1). The transformation F * of Fig. 1, used to transform an electromagnetic boundary value problem, follows the rules of Φ from Fig. 2. Having a single explicit definition for all the various (p, q, ω) transformations makes this theory very promising as a starting point for an implementation in a software project. Furthermore, it is apparent that such software will heavily rely on the correct evaluation of Jacobians at the right coordinates of possibly chained transformations. That is the reason why we are so interested in a very solid foundation of encoding partial derivatives and their chain rule in Secs. 2 and 3.
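Since the correct evaluation of Jacobians at the right coordinates is the critical point above, a small numerical sanity check of the chain rule for chained transformations can be sketched as follows; the maps F and G are arbitrary illustrative choices, not from the paper.

```python
import numpy as np

def jacobian(f, x, h=1e-6):
    # Forward-difference approximation of the Jacobian of f at x.
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x))
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        xh = x.copy()
        xh[j] += h
        J[:, j] = (np.asarray(f(xh)) - fx) / h
    return J

def F(u):  # first chart transformation (illustrative)
    return np.array([u[0] * u[1], u[0] + u[1]])

def G(x):  # second chart transformation (illustrative)
    return np.array([np.sin(x[0]), x[0] * x[1]])

u0 = np.array([0.5, 1.5])
J_chain = jacobian(lambda u: G(F(u)), u0)
# Chain rule: the Jacobian of G after F is the product of the Jacobians,
# with the outer one evaluated at the transformed point F(u0).
J_product = jacobian(G, F(u0)) @ jacobian(F, u0)
```

Evaluating the outer Jacobian at u0 instead of F(u0) is precisely the kind of mistake a solid encoding of the chain rule should rule out.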

Software Context
There is much within computational electromagnetism that counts as software. This community has a history of incorporating guidance from mathematical structure of the electromagnetic theory into the development of consistent and stable numerical methods.
When speaking of numerical software, this paper focuses on the finite element method, which is a Galerkin method that can be expressed in terms of the Finite Element Exterior Calculus 3 . There are a lot of common mistakes leading to wrong solutions of the finite element method, and the reason for failure is often not that obvious 3 . This non-obviousness is mirrored in the extensive development of Galerkin methods, and in particular the finite element method, within the past decades.
For a numerical consideration, i.e., for the purpose of establishing proven guarantees of certain errors, a numerical method is abstractly modeled using an abstract Hilbert space V . It is assumed that the numerical problem can be expressed using a bounded bilinear form B : V × V → R and a bounded linear form F : V → R as: find u ∈ V such that

B(u, v) = F (v) for all v ∈ V.

The problem is called well-posed if a unique solution u exists and the solution mapping F → u is bounded again. Using that formulation, a Galerkin method is characterized by a family of finite-dimensional, normed spaces V h , indexed by a parameter h, that in some sense approximate V (the spaces V h do not necessarily have to be subspaces of V ). The Galerkin method for that family of spaces reads: find u h ∈ V h such that

B(u h , v h ) = F (v h ) for all v h ∈ V h . (3)

It is desired to prove that the property of «V h approximating in some sense V as h advances» is conveyed to «u h approximating in some sense u as h advances». The finite element method is a Galerkin method where the elements of the basis of V h have finite support, i.e. they are nonzero only on a small part of the considered domain.
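As a minimal runnable illustration of such a Galerkin method (a textbook example, not one of the constructions discussed here), consider -u'' = f on (0, 1) with homogeneous Dirichlet boundary conditions and piecewise linear hat functions spanning V_h:

```python
import numpy as np

def solve_poisson_1d(f, n):
    # n interior nodes on a uniform grid; hat functions form the basis of V_h.
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    # Stiffness matrix B(phi_i, phi_j) = integral of phi_i' phi_j' (tridiagonal).
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h
    # Load vector F(phi_i) approximated by h * f(x_i) (lumped quadrature).
    b = h * f(x)
    return x, np.linalg.solve(A, b)

# For f = pi^2 sin(pi x) the exact solution is u(x) = sin(pi x).
x, u_h = solve_poisson_1d(lambda x: np.pi**2 * np.sin(np.pi * x), 50)
err = np.max(np.abs(u_h - np.sin(np.pi * x)))
```

The hat functions have finite support, which is what makes the stiffness matrix sparse (here tridiagonal).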
A construction of bases for a family V h of spaces can be proven to be consistent and stable when used in a Galerkin method. The Finite Element Exterior Calculus provides constructions of classes of finite element bases whose Galerkin methods were proven to be consistent and stable 2 3 . This was done utilizing notions from differential geometry and algebraic topology in order to develop methods for error analysis. It is necessary to do this within a functional analytic setting, because the notions of approximation and error, and therefore of consistency and stability of a numerical method, ultimately originate there.
We consider the explicit construction 2 of two families of explicit local bases. Here the approach was "not trying to find hierarchical bases, but rather [...] generalize the explicit Bernstein basis" 2 . Where it is easy to give a spanning set of polynomials meeting the requirements, it is much harder 2 to provide a basis of linearly independent polynomials.
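The univariate instance of the Bernstein basis mentioned above can be sketched in a few lines; the function name is our own and only the degree-r interval case is shown, expressed in the barycentric coordinates (1 - t, t) of the unit interval.

```python
from math import comb

def bernstein_basis(r, t):
    # Degree-r Bernstein basis on [0, 1] in barycentric coordinates (l0, l1).
    l0, l1 = 1.0 - t, t
    return [comb(r, k) * l0 ** (r - k) * l1 ** k for k in range(r + 1)]

# The basis elements are nonnegative on [0, 1] and sum to one (partition of unity).
values = bernstein_basis(3, 0.25)
total = sum(values)
```

The partition-of-unity and nonnegativity properties are what make this basis explicit and well-conditioned, in contrast to an arbitrary spanning set.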
A key insight is to decompose this construction of base elements and define the polynomial base in terms of smaller shape functions. Multiple adjacent shape functions are recombined into one base element by enforcing proper interelement continuity conditions 2 . This approach is sometimes called an assembly 2 7 . With its interelement continuity conditions, this process is the reason why multiple shape functions from adjacent pieces of a domain share the same degree of freedom. The presented 2 assembly process for the construction of basis functions is "a straightforward consequence of the geometric decomposition of the finite element spaces" 2 .
For both families of shape function spaces, for each simplex T and each subsimplex f , sometimes called a face, with r ≥ 1 and 0 ≤ k ≤ d and d = dim f ≥ k, there is a shape function space and there are degrees of freedom. One is the shape function space of polynomial differential forms P r Λ k (T ) with corresponding degrees of freedom

u → ∫ f tr f,T u ∧ v, v ∈ P − r+k−d Λ d−k (f ),

where ∧ is the exterior product. For differential forms, the trace operation tr f,T is the pullback ι * f,T of the inclusion ι f,T : f → T . The other one is the shape function space of polynomial differential forms P − r Λ k (T ) with corresponding degrees of freedom given by

u → ∫ f tr f,T u ∧ v, v ∈ P r+k−d−1 Λ d−k (f ).

The construction of the base elements for a shape function space is "somehow a complicated business" 8 and provided 2 in terms of:
• a simplicial complex
• taking the set of subsimplices of a given simplex
• restriction maps from a simplex to one of its subsimplices and inclusion maps the other way around
• barycentric coordinates
• the exterior derivative of barycentric coordinates
• piecewise polynomial differential forms
• the pullback of polynomial differential forms along a restriction map
• multi-indices
• the index set associated to a face of the simplicial complex
• the set of all order-preserving maps of indices
• taking the support of a multi-index
• taking the range of an order-preserving map.
For every item on this list, we will probably have some correspondence within an implementation for a machine. A simplicial complex is usually given by a mesh. It is mostly stored in two separate parts. One part is an abstract simplicial complex consisting just of the combinatorial information which is sometimes called the mesh topology. The other part is additional data which can be used to create homeomorphisms from the standard simplices to the given ones. This data forms a parametrization and provides barycentric coordinates. In the case of a simplicial complex, this data might just contain vertex coordinates, but it becomes more interesting for curved cells.
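A minimal sketch of this two-part mesh storage, with an illustrative (assumed) data layout, might look as follows; the barycentric coordinates of a point in a cell are recovered from the vertex coordinates of that cell.

```python
import numpy as np

# Part one: the abstract simplicial complex (mesh topology), pure combinatorics.
triangles = np.array([[0, 1, 2], [1, 3, 2]])
# Part two: additional data, here plain vertex coordinates (a parametrization).
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

def barycentric(tri, p):
    # Solve l0*v0 + l1*v1 + l2*v2 = p together with l0 + l1 + l2 = 1.
    v = vertices[triangles[tri]]
    A = np.vstack([v.T, np.ones(3)])
    rhs = np.append(p, 1.0)
    return np.linalg.solve(A, rhs)

lam = barycentric(0, np.array([0.25, 0.25]))
```

For curved cells, the second part would store a richer parametrization instead of bare vertex coordinates, while the first part stays untouched.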
Multi-indices and order-preserving index maps are rooted in the combinatorial domain. Their representation in an implementation might be exploited in a clever way. The most intriguing correspondence, we think, is the one of polynomial differential forms and their pullbacks. These can be resolved within a pen-and-paper computation 9 and then the resulting polynomials can be implemented very carefully. But it also seems reasonable to formulate the whole construction of a shape function element within a programming language. One of our goals within this paper is to investigate how to do so in an appropriate way.
This approach essentially lifts the implementation to a meta-level. Previously, as programmers, we were seeking an implementation to perform a numerical computation in the most efficient way for a given machine. Now, we have to program an implementation that is able to produce another, more concrete, implementation which in turn is able to perform the numerical computation in the most efficient way for a given machine. The efficiency of this meta-implementation is usually not critical for the efficiency of the resulting implementation.
One obvious technique is to generate source code of an implementation with the meta-implementation. The programming language of the meta-implementation does not need to be the same as the programming language of the targeted implementation. Most programming languages offer meta-programming constructs to generate computations and data structures, and to express constraints that are to be checked during this generation. These constraints are used to restrict the argument's domain of a meta-computation. The templating system 13 of the C++ programming language is a very popular choice in the community of computational electromagnetism 17 . This might be partly because it allows using the same language for the implementation and the meta-implementation. While this choice of programming language helps the programmer in putting the machine into its most efficient state, it offers limited flexibility in expressing logical constraints for the valid application of meta-computations. Therefore, the expression of algebraic rules from a construction of finite element bases might only be partially incorporated. When seeking confidence, it is critical to be able to express all rules that one needs to be confident of. These rules are expressed in the programming language of the meta-implementation in order to have them checked automatically. (Regarding the pullback computations mentioned in Sec. 1.2: a limited amount is shown in a table in the original paper 2 and the various families of bases are implemented within the FEniCS 1 project; reproducing the bases from their paper required some amount of bookkeeping.) We might even claim that the usefulness of checking rules critically depends on the completeness, or coverage, of the rules regarding all possible cases. Putting it another way, we claim that
• achieving high efficiency is the biggest challenge when programming the implementation, whereas
• achieving high validity is the biggest challenge when programming the meta-implementation.
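A toy version of such a source-code-generating meta-implementation, with hypothetical names, could look like this; a constraint is checked at generation time, and the generated implementation is then compiled and used.

```python
def generate_dot_product(n):
    # Meta-computation: the constraint on the argument's domain is checked
    # here, at generation time, not inside the generated implementation.
    if n < 1:
        raise ValueError("dimension must be positive")
    body = " + ".join(f"a[{i}] * b[{i}]" for i in range(n))
    # Emit specialized, fully unrolled source code for the implementation.
    return f"def dot{n}(a, b):\n    return {body}\n"

source = generate_dot_product(3)
namespace = {}
exec(source, namespace)      # "compile" the generated implementation
dot3 = namespace["dot3"]
```

Here validity (the constraint on n) lives entirely in the meta-implementation, while the generated function is free of checks and branches, mirroring the efficiency/validity split claimed above.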
That is why we advocate the use of a programming language with a checking mechanism for dependently typed 16 expressions to formulate constraints. For the meta-program this offers a chance to express all algebraic rules completely. This is relevant because meta-implementation techniques seem to become more and more unavoidable in modern high performance computing.

Computational Context
Within this contribution, the partial derivative and its corresponding chain rule for multivariate functions will be investigated with respect to their encodability in computational terms. A functional analytic setting is very powerful for an analysis of problems related to partial differential equations. In this paper, we will treat the operation of taking the derivative of a univariate function in a more synthetic way. The derivative operation will be embedded into a more general context of computation where some basic properties become assumptions of that embedded derivative operation.
In this paper, the understanding of computational terms is backed by the lambda calculus (λ-calculus), which serves as a model, or definition, of effectively calculable functions. That calculus was originally developed by A. Church in 1936 10 and we will follow a modern treatise 6 of the resulting findings. We will take a type-free 10 λ-calculus that is extended in Sec. 3 to a typed variant. The type-free λ-calculus is constituted by a set Λ of λ-terms built up from an infinite set of variables V = {v, v′, v″, ...} using application and function abstraction:

x ∈ V ⇒ x ∈ Λ,  M, N ∈ Λ ⇒ (M N ) ∈ Λ,  M ∈ Λ, x ∈ V ⇒ (λx M ) ∈ Λ.

We choose the convention to suppress the outermost parentheses in (λx M ) when it is unambiguous and to add a separating dot in between x and M , resulting in λx . M . On these terms, an operation of substituting N for the free occurrences of x in M can be defined and is denoted by M [x := N ]. Furthermore, there are binary relations for η-reduction, α-conversion and β-reduction, reading from left to right:

λx . (M x) ≡ M  (η, if x does not occur free in M ),
λx . M ≡ λy . M [x := y]  (α, if y does not occur free in M ),
(λx . M ) N ≡ M [x := N ]  (β).

We will regard two terms as computationally equivalent if they can be related to each other in these ways. Therefore, a symmetric ≡-symbol is already present here, although η-reduction, α-conversion and β-reduction are defined as operations from the left-hand side to the right-hand side in the previous listing. The usage of λ-calculus will be elaborated in more detail in Sec. 2 and put in a more rigorous setting in Sec. 3.
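The terms, substitution and β-reduction just introduced can be encoded directly; the following sketch uses a deliberately simple tuple representation (an assumption for brevity) and performs one β-step at the root of a term.

```python
# λ-terms as tagged tuples: variables, applications and abstractions.
def Var(x): return ("var", x)
def App(m, n): return ("app", m, n)
def Lam(x, m): return ("lam", x, m)

def subst(term, x, n):
    # term[x := n]; for brevity this assumes bound names of `term` do not
    # clash with free names of `n` (no full capture-avoiding renaming).
    tag = term[0]
    if tag == "var":
        return n if term[1] == x else term
    if tag == "app":
        return App(subst(term[1], x, n), subst(term[2], x, n))
    if term[1] == x:          # x is rebound by this abstraction: stop here
        return term
    return Lam(term[1], subst(term[2], x, n))

def beta(term):
    # One β-step at the root: (λx . M) N  ->  M[x := N].
    if term[0] == "app" and term[1][0] == "lam":
        _, x, m = term[1]
        return subst(m, x, term[2])
    return term

# (λx . x x) y  reduces to  y y
redex = App(Lam("x", App(Var("x"), Var("x"))), Var("y"))
result = beta(redex)
```

A full evaluator would also need α-renaming to avoid variable capture; the typed variant of Sec. 3 is precisely where such bookkeeping is made rigorous.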
An introductory survey, reaching out to the techniques 11 used in Sec. 3, can be found in the literature 23 as the propositions as types paradigm. This paradigm pictures the development from λ-calculus to the proofs-as-programs and propositions-as-types interpretation through one of the most prominent developments within theoretical computer science: the Curry-Howard correspondence.
The rules of the black bits of Fig. 2, mentioned in Sec. 1.1, lead to constraints for restricting an argument's domain of a meta-computation, mentioned in Sec. 1.2. These rules can be formulated as propositions about objects within the theory of differential forms. Analogously, the constraints for restricting an argument's domain of a meta-computation can be formulated as propositions about objects in the programming language. A compiled programming language is able to perform compile-time checks based on type equations formulated in the programming language's type system. Expressing a proposition as type equations, and having these type equations checked, or affirmed, conveys the affirmation of the type equations to an affirmation of the proposition. Therefore, using the type system of a programming language for the purpose of establishing validity of a meta-implementation depends on the translatability of propositions into equivalent type equations, equivalent with respect to affirmation. This orientation towards translatability and high validity differs from the algorithmically oriented approach of computer algebra systems.
To support the theory in Sec. 3 we have used a programming language that is developed precisely for the purpose of establishing a translatability of propositions into equivalent type equations. This choice seems to offer the best chance of being able to express all algebraic rules of a construction of finite element shape functions completely. But that is an outlook. In this paper we propose an encoding of the partial derivative in λ-calculus as a foundation of a system of algebraic rules. That foundation is tailored towards an application in numerical methods, especially the finite element method.

Varying Syntax and Semantics
We introduced the λ-calculus in the standard notation, which is also the notation we use in our implementation later on. But for this modeling part, we switch to a barred-arrow (→) notation, since it resembles the standard notation of electromagnetic theory.
The references we give here for notation might be a bit cherry-picked, and it is, of course, a matter of taste. But it is this notation that illustrates day-to-day problems when working within electromagnetic theory.
In one reference 7 from the domain of computational electromagnetism, A. Bossavit argues 12 about a notation for functions. The given argument is to advertise using an arrow-symbol 13 in order to better emphasize the distinction of functions and expressions. It is recommended to denote a function f of an expression e(x) by f : x → e(x) rather than by f (x) = e(x), where it is stressed that using just = in the second case could be interpreted as an equality instead of a definition. This is accompanied by an example of a differential operator, helping to resolve some "ambiguity as to which gradient, with respect to x or to y, we mean", making x the parameter and y the variable, both of which are vectors: grad(y → f (x, y)). Those differential operators act on function objects, and their notation might be borrowed from the notation of higher-order functions in programming. A reference to programming, and especially to λ-calculus, is already drawn in that reference.
In the same way that higher-order functions, or functional programming in general, are known to have a steep learning curve to overcome, something similar applies here. This might be why, in engineering, a codomain-focused style of notation, as in (7), is usually preferred.
We think that some confusion arises by taking expressions, and not functions, as the dominant objects in calculus. For instance, the Mathematica 24 programming language follows an expression-focused approach.
Speaking about expressions, coincidentally another example 15 is given by P. Martin-Löf, although he was not concerned with differential calculus and used it as a mere example for forms of expressions. He introduces the i-th partial derivative of f at a as D i f (a). He alludes that the partial derivative is the ordinary derivative of a certain function, e.g. if g(x) = f (a 1 , ..., x, ..., a n ) then D i f (a) = g′(a i ). Notation is briefly discussed and it is mentioned that the notation D i f resolves the usage of a notation like ∂f /∂x i . Another issue is framed by E. Tonti, who also dedicates a chapter 21 to revising terminology. There, many kinds of equality are illuminated. He proposes five different such equalities to be suitable for the purpose of explaining electrodynamics, instead of just using a single = for all of them. This issue could also be summarized by arguing that "the fragment of mathematical symbolese available to most calculus students has only one verb, '=' " 20 . "That's why students use it when they're in need of a verb." 20 In general, there is "a list of different ways of thinking about or conceiving of the derivative" 20 of a function instead of a single way to do so. These all make their appearance at some point when studying electromagnetism.
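Martin-Löf's remark that the partial derivative is the ordinary derivative of a certain function can be sketched directly in code; here a central finite difference stands in, as an assumption, for whatever univariate derivative operation is given, and the function names are our own.

```python
def univariate_derivative(g, x, h=1e-6):
    # Stand-in for a given univariate derivative operation (central difference).
    return (g(x + h) - g(x - h)) / (2.0 * h)

def partial_derivative(f, i, a):
    # D_i f(a): fix all arguments except the i-th, obtaining the univariate
    # function g(x) = f(a_1, ..., x, ..., a_n), then differentiate g at a_i.
    def g(x):
        args = list(a)
        args[i] = x
        return f(*args)
    return univariate_derivative(g, a[i])

# Example: f(x, y) = x^2 * y, so D_0 f(3, 5) = 2*3*5 = 30 and D_1 f(3, 5) = 9.
f = lambda x, y: x ** 2 * y
d0 = partial_derivative(f, 0, (3.0, 5.0))
d1 = partial_derivative(f, 1, (3.0, 5.0))
```

Note that the multivariate notion is built entirely on top of the univariate one, which is exactly the layering this paper targets.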
We might summarize that these authors propose an expressive notation for indicating what kind of statement is expressed by =, maybe even which kinds of objects it relates, and how the variables of an expression are quantified.
Rather than defining what the univariate derivative is, we treat it synthetically and collect the few properties necessary for introducing a partial derivative on top. In order to resolve the various notations, we have chosen to resemble λ-calculus.
To support multiple interpretations, we chose to explain our usage of λ-calculus with a changed notation.
From our experience this better resembles day-to-day notation in computational electromagnetism but is close enough to follow the notation of a formalization later on in Sec. 3. This choice is made to support readers who are not immediately implementing such a λ-calculus but still want to gain some insights about the partial derivative.

Yet another notation for functions
Our aim is to connect higher-level theories, such as tensor calculus and differential forms, to lower-level theories, such as multivariate calculus and λ-calculus. With tensors and differential forms it is possible, in a tractable way, to express sound notions of invariant properties and differentials. In multivariate calculus and λ-calculus it is possible, in a tractable way, to express sound notions of a univariate derivative and of computations. After such a connection is made, representations and implementations that arguably behave in a way respecting these notions need to be given. Doing so should contribute to the discussion about how higher-level representations of physical entities can be encoded in a program.
We start with the assumption of a given univariate derivative operation that, for a given univariate function representation f, can compute the univariate function representation of the derivative of that function f. For the computational description, we make use of an untyped, simplified λ-calculus as introduced in Sec. 1.3. Instead of λx.f x, we denote function abstraction by x → f (x) to better resemble day-to-day notation. We emphasize that only the following rules are used, and it does not matter if you do not know λ-calculus yet, as long as you can familiarize yourself with these four computational equivalences (9)(10)(11)(12) that are already in use in engineering mathematics and denoted by ≡ here. The meaning of these equivalences is explained in the following. They display as:

f ≡ η x → f (x) (9)

x → term ≡ α y → term[x := y] (10)

(y → term) (x) ≡ β term[y := x] (11)

f • g ≡ • x → f (g(x)) (12)

The intention of stating these rules is to be able to distinguish and name them. Our application of the η-equivalence on univariate functions (9) states that a function f and the λ-abstraction 14 immediately applying the argument, x → f (x), are computationally equivalent and therefore can be substituted for each other while respecting the computation's result. The α-equivalence (10) in this case states that it does not matter for the computation how the argument is named, of course. So every time ≡ α appears, the left hand side can be transformed in a computationally equivalent way into the right hand side by argument-renaming and vice versa. The β-equivalence (11) expresses that an application of the function y → term, i.e. the term regarded as dependent on its variable y, to the argument x is computationally equivalent to the term[y := x] where all occurrences of y are substituted by x. This is denoted by the substitution [y := x] acting on the term as a postfix operation. Lastly, not so much a rule of λ-calculus but more a definition of the composition operation •, is the rule (12).
These rules (9)(10)(11)(12) are somewhat standard rules that are most likely fulfilled in any context of computation.
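As an illustration of how such rules become computational, the β-rule and the shadowing behavior discussed later can be sketched in a few lines of term rewriting. The encoding of terms as nested tuples and the helper names Var, Lam, App, subst and beta are our own assumptions, not part of the paper's formalization.

```python
# Lambda terms as nested tuples; capture-naive substitution relies on
# shadowing for inner binders, as discussed in the text.

def Var(name):       return ("var", name)
def Lam(name, body): return ("lam", name, body)
def App(fun, arg):   return ("app", fun, arg)

def subst(term, name, repl):
    """term[name := repl]; naive, relying on shadowing for inner binders."""
    tag = term[0]
    if tag == "var":
        return repl if term[1] == name else term
    if tag == "lam":
        _, x, body = term
        if x == name:                 # the inner x shadows the outer one
            return term
        return Lam(x, subst(body, name, repl))
    _, f, a = term
    return App(subst(f, name, repl), subst(a, name, repl))

def beta(term):
    """One beta step at the root: (y -> body)(arg) == body[y := arg]."""
    _, f, a = term
    return subst(f[2], f[1], a)

# (y -> f(y))(x) reduces in a beta-way to f(x)
t = App(Lam("y", App(Var("f"), Var("y"))), Var("x"))
print(beta(t) == App(Var("f"), Var("x")))  # True
```
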
In λ-calculus every function takes exactly one argument and has one result, which is a perfect interpretation for univariate calculus. In computational electromagnetism, the representations of the considered objects (the electric and magnetic fields, the geometry, e.g. when given by parametrized coordinates, and coordinate transformations) are expressed as multivariate functions, taking multiple arguments to multiple results 15 . Multiple arguments can already be thought of as represented by one argument with the help of the notion of a tuple, where the single arguments are separated by commas. Multiple results can be thought of as tuples in a similar manner. Yet, we choose a notation here that allows a multiple-argument interpretation instead of tuples. It seems most familiar to the engineering community and does not pose a limitation, since a multiple-argument interpretation is translatable into a one-argument interpretation.
That notation is motivated by the tediousness of multivariate calculus in expressing function application for these multiple arguments 16 . For a term, we denote the expansion by term..., which should be computationally equivalent to a context where the comma-separated copies of the term, substituted with every single parameter, or variable in our case, of a tuple, are applied. If x denotes a tuple of four parameters, the expansion of the most simple term, just consisting of x itself, corresponds to

x... ≡ m x 1 , x 2 , x 3 , x 4

Here, the tuple expansion ... captures 17 all tuples in the term x, which is just x, and expands them to x 1 , x 2 , x 3 , x 4 within the original term to produce the resulting term. The three dots are used frequently in a meta-logical manner where it is clear from the context how to continue the pattern. When it comes to an implementation, one needs to make this pattern-repetition precise. In the following we make use of the three dots ... only in the sense of this kind of expansion, where the tuple is again underlined to highlight its meaning as a placeholder. The unexpanded term is denoted as computationally equivalent in an m-way, ≡ m , to the expanded one.

15 We use the name multivariate, although it usually denotes functions taking multiple arguments to one result. Since in our case the results are not correlated to each other, and functions that give multiple uncorrelated results can be represented as a collection of these multivariate functions in the usual sense, we do not distinguish the terms here that much.

16 Our proposed variant is mostly borrowed from the parameter-pack expansion, a carefully specified notation that first appeared in the 2011 version of the standard of the C++ programming language 13 . A parameter-pack can only appear in a meta-computation expressed within the templating system of C++. This notation is implemented in all current compilers complying to that standard.
The reason for introducing this particular notation is that it supports us in making precise arguments about multivariate functions in the previous sense. Our most important application is to express multivariate function application. E.g. suppose g is a multivariate function in R 2 → R 3 such that it can be decomposed into functions g 1 , g 2 and g 3 in R 2 → R; then we have two computationally equivalent terms with the nested use of the operation of tuple-expansion ... :

g(x...)... ≡ m g 1 (x 1 , x 2 ), g 2 (x 1 , x 2 ), g 3 (x 1 , x 2 )

Here, the outer tuple expansion ... captures the tuple g, where the inner tuple expansion ... captures the tuple x. Another use 18 is, given that γ is a multivariate function in R 1 → R 3 that can be decomposed into the functions γ 1 , γ 2 and γ 3 in R 1 → R; then we have two computationally equivalent terms with the expansion ... of multiple nested tuples γ and x: Here, a single tuple expansion ... captures both tuples γ and x. Given the notion of tuples x, y and the operation of tuple-expansion ..., we can restate the previous computational equivalences (9)(10)(11)(12) in their multivariate version (13)(14)(15)(16):

f ≡ η (x...) → f (x...) (13)

(x...) → term ≡ α (y...) → term (x := y)... (14)

(y...) → term (x...) ≡ β term (y := x)... (15)

f • g ≡ • (x...) → f (g(x...)) (16)
Note especially how expansion interacts with composition of multivariate functions in (16).
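The decomposition of g into g 1 , g 2 , g 3 and the expansion over the tuple x can be mimicked in an ordinary language without parameter packs; the helper expand and the concrete component functions below are hypothetical choices for illustration only.

```python
# g : R^2 -> R^3 decomposed into g1, g2, g3 : R^2 -> R; the expansion
# (g(x...)...) applies every decomposed part to the expanded tuple x.
# All concrete functions are made up for illustration.

g1 = lambda x1, x2: x1 + x2
g2 = lambda x1, x2: x1 * x2
g3 = lambda x1, x2: x1 - x2

def expand(parts, x):
    """Comma-separated copies of the application, one per part of g."""
    return tuple(p(*x) for p in parts)

g = lambda *x: expand((g1, g2, g3), x)

print(g(2.0, 3.0))  # (5.0, 6.0, -1.0)
```

Here the choice between multiple arguments and one tuple argument is visible in `g(2.0, 3.0)` versus `expand(..., (2.0, 3.0))`, mirroring the two interpretations discussed above.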
One more remark about tuples: you might have noticed that, despite underline and dots, the rules (13-16) exactly match the rules (9-12). That is not a coincidence. Indeed, we could identify a scalar with the one-tuple of scalars and have just one generalized version of the rules. This works for all tuples, including one-tuples, and therefore also for all scalars. Also, in Sec. 3 we do regard multivariate functions as mapping tuples of scalars to tuples again 19 . The different notation and the reference to parameter-packs are just given to support an interpretation within a language that does not identify scalars with one-tuples and may even distinguish tuples from a list of function arguments 20 .
You are free to ignore the three dots when targeting an interpretation where function arguments and tuples are treated the same way 21 . But as with higher-order functions, tuples add a small burden to the learning curve, and it is sometimes convenient to just think of a written-out version when comparing with the literature. One can test this preference by looking at f (x, y, z): if that should be a function f applied to the function arguments x, y and z, then the tuple-expansion notation might be a fit. But if you are comfortable with (x, y, z) being a tuple, and if, with that tuple named τ, your preferred notation is just f τ, then you might want to ignore the dots. When programming, this choice is made by the programming language.

Encoding the partial derivative
We make use of the previously introduced equivalences to formulate what a partial derivative should be in this context. It is thought of as the univariate derivative of a multivariate function which is regarded as a univariate function depending only on the one argument that we are taking the derivative of. That univariate regarding of a multivariate function can now be made precise: Suppose the multivariate function h is in R 3 → R. Then h is computationally equivalent in an η-way to the multivariate function (x...) → h(x...), as in (17):

h ≡ η (x...) → h(x...) (17)

Just the inner term h(x...) of that new multivariate function is computationally equivalent to (x 2 → h(x...)) (x 2 ) in a univariate-β-way (18):

h(x...) ≡ β (x 2 → h(x...)) (x 2 ) (18)

To see this, for the example, we look at the expanded version (19):

h(x 1 , x 2 , x 3 ) ≡ β (x 2 → h(x 1 , x 2 , x 3 )) (x 2 ) (19)

What happened is that the inner abstraction of x 2 is shadowing 22 the outer argument x 2 . To highlight this difference, we explicitly rename the inner x 2 , giving an α-equivalent function with z occurring instead (20):

(x 2 → h(x 1 , x 2 , x 3 )) (x 2 ) ≡ α (z → h(x 1 , z, x 3 )) (x 2 ) (20)

In a multivariate way, this constitutes the substituted expansion of the tuple x, denoted as x[2 := z]..., where entry 2 is replaced with z, as in (21):

(x 2 → h(x...)) (x 2 ) ≡ α (z → h(x[2 := z]...)) (x 2 ) (21)
This leads to the last rule of computational equivalence that we need for our considerations; it relates a multivariate function application to the use of a univariate function application:

f (x...) ≡ (z → f (x[i := z]...)) (x i ) (22)

To better familiarize with it, looking forward to an implementation, we give the syntax tree of this rule in Fig. 3.
That is, finally, enough to define the partial derivative on multivariate functions f : R d → R c by the notion of the derivative on univariate functions. For a general arity and the indices j ∈ [1, c] and i ∈ [1, d] it is given as the multivariate function

∂ i f j := (x...) → (z → f j (x[i := z]...)) ′ (x i ) (23)

where ′ denotes the given univariate derivative operation and f j is the projection proj j • f onto the j-th result of the multivariate function f or, similarly, the j-th part of the decomposition of f in the previously discussed manner. In that definition (23) we do not use the information about how to name the argument with respect to which we are taking the partial derivative. That is the case because partial derivatives with differently named arguments are computationally equivalent by α-equivalence. There are two remarks to make here. Firstly, the computational equivalence of the partial derivative under renaming of the argument, i.e. the α-equivalence, motivates omitting the variable name in the notation ∂ i f j . Later on, however, in the theory of differential forms, this exact spot for giving a name to the argument is often used to indicate which charts are involved in the process of coordinate transition 23 . That characteristic results from the use of function-abstraction to express the partial derivative instead of introducing a new form of expression as in the example of P. Martin-Löf given in Sec. 2.1. In this way, the definition (23) does not bind any free variables of its argument-terms.

22 In theoretical computer science this is usually realized not by shadowing, but by limiting the α-equivalence to the cases where the argument x of x → term does not occur as a free variable of the term, which is stated as x ∉ FV(term). But shadowing exists in most programming languages.
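A minimal numerical sketch of this reading of the partial derivative, assuming a central-difference realization D of the univariate derivative operation; the names D, partial and f1 are ours, chosen for illustration.

```python
# The i-th partial derivative of f_j as the univariate derivative of the
# section z -> f_j(x[i := z]...), evaluated at x_i.

def D(f, h=1e-6):
    """Assumed univariate derivative operation: a function in, a function out."""
    return lambda z: (f(z + h) - f(z - h)) / (2 * h)

def partial(i, fj):
    """d_i f_j as a multivariate function; indices start at 1 as in [1, d]."""
    def df(*x):
        section = lambda z: fj(*(x[:i - 1] + (z,) + x[i:]))
        return D(section)(x[i - 1])
    return df

f1 = lambda x1, x2: x1**2 * x2
print(round(partial(1, f1)(3.0, 5.0), 4))  # 30.0, since d/dx1 (x1^2 x2) = 2 x1 x2
```

Note that, as in (23), no argument name is consumed: partial only needs the entry index i and the function object fj.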
Secondly, for a transition along f from A-coordinates to B-coordinates, i.e. where f is a function expressing the B-coordinates in terms of A-coordinates, (f (a...)...) = (b(a...)...), we have ∂f j /∂a i (a...) constituting the number in the j-th row and the i-th column of the Jacobi-matrix J f evaluated in A-coordinates at (a...). That matrix is used to transform the numbers (v B ...), the vector-components with respect to the B-induced basis at a point given by the same A-coordinates, into the numbers (v A ...), the vector-components with respect to the A-induced basis at the same physical point, by matrix-vector-multiplication 24 . This scrutiny forms the foundation of a matrix-translation in terms of the Jacobi-matrix for different kinds of vectors. It is important to gain support from encoding this logic into the notation and into the program, in order to handle these different calculations and check them for consistency.
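Under the same finite-difference assumption, the Jacobi-matrix of a toy transition can be assembled entry-wise from these partial derivatives and applied to vector components by matrix-vector multiplication; the polar-like transition and all helper names below are illustrative only.

```python
# Assembling J[j][i] = d f_j / d a_i for a toy polar-like transition,
# then transforming components; D is an assumed central-difference derivative.
from math import cos, sin

def D(fn, h=1e-6):
    return lambda z: (fn(z + h) - fn(z - h)) / (2 * h)

def jacobian(parts, a):
    """J[j][i] = d f_j / d a_i evaluated at the point a."""
    J = []
    for fj in parts:
        row = []
        for i in range(len(a)):
            section = lambda z, i=i, fj=fj: fj(*(a[:i] + (z,) + a[i + 1:]))
            row.append(D(section)(a[i]))
        J.append(row)
    return J

# (b1, b2) = f(a1, a2) = (a1 cos a2, a1 sin a2), evaluated at (2, 0)
f = (lambda a1, a2: a1 * cos(a2), lambda a1, a2: a1 * sin(a2))
J = jacobian(f, (2.0, 0.0))
v = (1.0, 1.0)
w = tuple(sum(J[j][i] * v[i] for i in range(2)) for j in range(2))
print([[round(x, 4) for x in row] for row in J], tuple(round(x, 4) for x in w))
# [[1.0, 0.0], [0.0, 2.0]] (1.0, 2.0)
```
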

The Chain-Rule revised
Using just these established conditions, we will derive what it means to have a notion of a chain rule for the partial derivative, lifting the univariate chain rule to the multivariate level. The whole calculation is given in appendix A. In order to create the multivariate listing in appendix A, and the corresponding one for a concrete two-variate case in appendix B, we have implemented the tuple expansion in the previously introduced way.
We begin in (A1) with the partial derivative, which can be represented in an implementation carrying no more information than written in (A1), i.e. which function f j • g it applies to and with respect to which entry i, or directly as the function that we encoded definitionally in (23). In the first case an implementation needs to provide a function that converts these bits of information into that encoding. In the second case we directly operate on these objects. The multivariate function (A2) again does not need more information encoded than written out there, and the data structure is very similar to the one resulting from a tree-like encoding of Fig. 3. The expanded term for the two-variate case where i = 1 is given by (B2), and you can follow the expanded variant in appendix B alongside this investigation.
An equivalent computation (A3) is given by the multivariate •-equivalence, applying f j to g instead of composing it with g. At this point, we make use of a linearity property which needs to be fulfilled by a concrete realization of the univariate derivative later on, namely that the univariate derivative of a multiply occurring argument is given by the sum of the univariate derivatives of each occurrence. We denote this by = lin ; for the two-variate example it is given by: For our general multivariate notation, h has to be identified with h := (z...) → f (g(x[i := z]...)...), leading to the general multivariate variant of this linearity, expressed with a summation over a new index k: which expands in the two-variate case for i = 1 to: Note the nested substitution in the right-hand-side term of (25), where only the application of the k-th decomposition of g is applied differently to the x's, of which just the i-th one is replaced with z. Therefore the linearity = lin justifies that (A4) computes the same result.
The nested substitution is computationally equivalent to the composition of univariate functions containing just a single substitution each, as in (26), which is the needed transformation that leads to (A5).
At this point, we have encoded the sum of k different univariate derivatives of compositions of two univariate functions (A6), where the univariate chain rule can be applied k times (A7) to lead to (A8). After transforming the right multiplicand in a β-way to the computationally equivalent form in (A9), it matches the definition of the partial derivative on g (A10). The left multiplicand can be turned in a •- and β-way into the computationally equivalent form (A11-A12), where the definition of the partial derivative again applies. This leads to the common form (A13) of the right hand side of the chain rule for the partial derivative of the composition of two functions f j • g. Almost, but not quite: the applied calculus enforced an explicit mention of the abstraction (x...) → , since these are function objects, and only if they are applied to the same arguments is the one resulting number equal for both sides:
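The derived chain rule can be checked numerically for a small example. This sketch again assumes a central-difference univariate derivative and illustrative functions g1, g2 and fj of our own choosing.

```python
# Check: d_1 (f_j . g)(x...) equals sum_k (d_k f_j)(g(x...)) * (d_1 g_k)(x...).

def D(fn, h=1e-6):
    return lambda z: (fn(z + h) - fn(z - h)) / (2 * h)

def partial(i, fn):
    def df(*x):
        section = lambda z: fn(*(x[:i - 1] + (z,) + x[i:]))
        return D(section)(x[i - 1])
    return df

g1 = lambda x1, x2: x1 * x2          # g : R^2 -> R^2, decomposed
g2 = lambda x1, x2: x1 + x2
fj = lambda y1, y2: y1**2 + y2       # f_j : R^2 -> R

comp = lambda x1, x2: fj(g1(x1, x2), g2(x1, x2))   # f_j . g

x = (1.5, 2.0)
lhs = partial(1, comp)(*x)
rhs = sum(partial(k, fj)(g1(*x), g2(*x)) * partial(1, (g1, g2)[k - 1])(*x)
          for k in (1, 2))
print(abs(lhs - rhs) < 1e-5)  # True; analytically both sides are 13
```

Both sides are evaluated at the same arguments, matching the remark that the function objects only agree after application.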

Targeting Tensor Calculus
In this paper our focus is to establish the lower interface that an encoding of the chain rule of multivariate functions demands from an encoding of the univariate chain rule. We investigated how to define the partial derivative in computational terms. We have shown in Sec. 2.4 that this computational context is capable of deriving a chain rule for this definition. In Sec. 3 we will introduce an augmented λ-calculus based on the requirement to express the derivation of the chain rule from Sec. 2.4. What remains open for discussion is the question whether that augmented λ-calculus is suitable to express definitions and derivations from tensor calculus. It is also not obvious what the upper interface to tensor calculus should look like. This section motivates why we think that our approach is extendable to express derivations from tensor calculus.
Continuing on (A13), with the •-equivalence we have a context (A14) where it is possible to make use of a function-level multiplication ⊗ that is given by the corresponding point-wise multiplication (A15). This is a binary operation and could be precomposed with a function applying g to the left argument and the identity id to the right argument. Defining such a function speaks in favor of having just one binary operation on the two partial derivatives (A16), making a corresponding data structure definition even more obvious. Establishing a function-level summation ⊕ makes it possible to express the chain rule in a completely so-called point-free 25 style (A17). The objects reasoned about in this expression should correspond (denoted by ∼ =T ) to objects of the expression (A18) of tensor calculus, where unfortunately ′ is a decoration on indices and not to be confused with the univariate derivative. We think that, based on the way of that correspondence ∼ =T , the question of encoding could be answered in a tractable way.

What are the objects of tensor calculus that are common to reason about in computational electromagnetism? In the appendix of his book 21 , E. Tonti collects the notions of:

• tensors and pseudotensors, such as tensor densities and tensor capacities, that differ in their transformation laws by a power of the determinant of the coordinate transition function,

• natural, reciprocal and physical basis vectors, leading to contravariant, covariant and physical components that are number-representations of various kinds of scalars and vectors in electromagnetic theory, and

• algebraic and metric dual vectors that constitute different representations of antisymmetric tensors.

25 i.e. a style where no arguments (x...) are present
In classical electrodynamics, the physical base is often chosen because of its property to preserve the calculation of the length of a vector. This gives a direct interpretation for the measurement of such a quantity in a cartesian system, which is very valuable in a physical interpretation. These choices are combined with constructions such as the magnetic flux tuple of numbers, corresponding to the three-number representation of the magnetic flux bi-covector at a point, and similar constructions. Therefore, we think it is worthwhile to investigate the computational aspects of such a correspondence. Following his notation, which is very well chosen to support the application in various physical theories, we give correspondences in Fig. 4.
Note especially the choice of the different symbols λ and Λ to reflect the information in which logical direction 26 the partial derivative has to be taken, and the drive to name the argument, x or the decorated x respectively, to remember the coordinate transition function's domain. The difference between tensor calculus and the presented formalism is that we regard objects that are functions and function compositions, where tensor calculus has a notion of coordinate system. That is the key abstraction necessary in an implementation suitable for computing the chain rule as a supporting layer. Consequently, we had no need to name the arguments, and by α-equivalence ≡ α it is indeed not possible to encode that additional information.

Figure 5: Encoding of the partial derivative used in tensor calculus

26 The direction, i.e. from the A coordinate system to the B coordinate system, or the direction in which f is defined, is meant here. To emphasize its distinction from the physical direction in space, we call it the logical direction instead.
Just to oppose it, we give in Fig. 4 another popular choice for denoting the partial derivative in tensor calculus: J k h for λ k h and J h k for Λ h k . As mentioned before, the ′ here should not be confused with the univariate derivative. The ′ is a decoration on the indices k and h to represent the coordinate system they belong to. Choosing different kinds of decorations for the indices, to avoid giving indices to the indices, is an inevitable problem when multiple coordinate systems are considered. In addition to that choice, there is the legitimate choice of the property of coordinate system belongingness being one of the index or being a property of the partial derivative object itself. The former perspective is taken in the notation we opposed, whereas the latter was denoted λ or Λ respectively. This state of affairs is also shown in Fig. 5. The answer to that question of choice highly influences the encoding of tensor calculus expressions for the purpose of an implementation.
As promised in the title, we will show here transformation laws for the magnetic flux B and the electric field E, although the point of this paper is not the result but the process of deriving these laws. For a clarified choice of ∼ =T , which we have not yet made in this paper, suppose that Z, A and B are given by a left decorated z, an undecorated a and a right decorated b coordinate system. In this notation, for clarification, the coordinate system belongingness is redundantly encoded in the choice of the letter, as well as in the decoration of that letter. This amounts to the habit that in the calculus of multivariate functions just different letters are used, where in tensor calculus only different decorations are used. Then, for the two transition functions g : Z → A and f : A → B, the tensor calculus expression that relates the covariant components B i j of the bi-covector of the right decorated coordinate system b to the ones B ij of the undecorated coordinate system a is given by: where free indices are highlighted in blue and bound indices, which are summed over, are highlighted in green. This translates into: As B ij should be regarded to naturally live on the undecorated coordinates a, and the resulting object B i j to live on the right decorated coordinates b, a precomposition with f is necessary to obtain the B ij value at b coordinates. Although this transformation goes in the same logical direction as the functions g and f are defined, the partial derivatives of the inverses of these functions appear, due to the contravariant transformation property of the considered electromagnetic quantity.
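For a linear toy transition (not the paper's example), the transformation of the components B ij with two inverse-Jacobian factors, and the invariance of the pairing with two contravariant vectors, can be sketched as follows; all matrices and values are made up for illustration.

```python
# Linear toy transition f : A -> B with constant Jacobians, so the
# inverse Jacobian d a_i / d b_k is exact; B'_kl carries two such factors.

J_f     = [[2.0, 0.0], [0.0, 3.0]]        # d b_j / d a_i
J_f_inv = [[0.5, 0.0], [0.0, 1.0 / 3.0]]  # d a_i / d b_k

c = 7.0
B = [[0.0, c], [-c, 0.0]]                 # antisymmetric components in a

Bp = [[sum(J_f_inv[i][k] * J_f_inv[j][l] * B[i][j]
           for i in range(2) for j in range(2))
       for l in range(2)]
      for k in range(2)]

# pairing with two contravariant vectors gives the same scalar in a and b
vA, wA = [1.0, 0.0], [0.0, 1.0]
vB = [sum(J_f[j][i] * vA[i] for i in range(2)) for j in range(2)]
wB = [sum(J_f[j][i] * wA[i] for i in range(2)) for j in range(2)]
pair = lambda T, v, w: sum(T[i][j] * v[i] * w[j]
                           for i in range(2) for j in range(2))
print(abs(pair(B, vA, wA) - pair(Bp, vB, wB)) < 1e-9)  # True
```
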
Tensor calculus is concerned with the invariance properties of different quantities. Suppose that two Jacobians cancel each other out in the following way, where δ i j is the Kronecker delta, which is 1 for i = j and 0 otherwise: Then it is easy to see that the transformations of the tensor components in S ij T i U j will cancel each other out. If the independent transformations of S ij , T i and U j are then we have for the composed term Here, the parentheses are present only for clarification, since this tensor expression represents scalar multiplications.
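The cancellation of two Jacobians to the Kronecker delta, and the resulting invariance of the composed term S ij T i U j , can be checked with a small linear example; the matrices below are made up for illustration.

```python
# J and its inverse cancel to the Kronecker delta; the components
# S_ij (covariant), T^i and U^j (contravariant) then transform with
# mutually canceling factors, leaving S_ij T^i U^j invariant.

J    = [[2.0, 1.0], [1.0, 1.0]]
Jinv = [[1.0, -1.0], [-1.0, 2.0]]   # inverse of J

delta = [[sum(Jinv[i][k] * J[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]

S = [[1.0, 2.0], [3.0, 4.0]]
T = [5.0, 6.0]
U = [7.0, 8.0]

Sp = [[sum(S[i][j] * Jinv[i][k] * Jinv[j][l]
           for i in range(2) for j in range(2))
       for l in range(2)] for k in range(2)]
Tp = [sum(J[k][i] * T[i] for i in range(2)) for k in range(2)]
Up = [sum(J[l][j] * U[j] for j in range(2)) for l in range(2)]

s  = sum(S[i][j] * T[i] * U[j] for i in range(2) for j in range(2))
sp = sum(Sp[k][l] * Tp[k] * Up[l] for k in range(2) for l in range(2))
print(delta, s == sp)  # [[1.0, 0.0], [0.0, 1.0]] True
```
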
In our previous example, the transformation of a bi-covector needs to be justified by invariance properties that are expressible within tensor calculus. The appearance of Jacobians is due to this invariance. With our proposed formalism we can express this as a computational equivalence. If we were to have representations for B i j and B ij , then such an equivalence should be derivable at this tensor-calculus layer. This suggests that the statements that need to be proven, which arise in tensor calculus, are expressible in our proposed formalism.
Another example is the tensor calculus expression relating the contravariant components E i of the vector E in the right decorated coordinate system b to the ones E i in the left decorated coordinate system z: Here again, the resulting E i should live on the right decorated coordinates b, where the original E i lives on the left decorated coordinates z. The transformation goes again in the same logical direction as the functions, but this time we have transformed twice.
To apply the introduced partial derivative of multivariate functions, it becomes necessary to precompose with proper inverses to obtain an expression that again depends on the right decorated coordinates b.
This second example of a chained coordinate transformation makes use of the derived chain rule, which is justified in the current formalism. This should make it easy, in an on-top tensor calculus layer, to computationally prove J i i J i i = J i i once a clarified choice is made of how the tensor calculus terms should correspond ∼ =T to terms from λ-calculus.

Rules of inference
In a logical inference, there is a need to distinguish two kinds of entities:

• "the entities that the logical operations operate on, which we call propositions" 14 28 , which are "affirmed in an affirmation and denied in a denial" 14 ,

• and "the things that the logical laws, by which I mean the rules of inference, operate on, which we normally call assertions" 14 , which are "those that we prove and that appear as premises and conclusion of a logical inference" 14 .
We are examining this topic at this point in the paper for two reasons: one is in preparation of stating introduction rules for an augmented λ-calculus; the other is that the word proposition has a different meaning in logic than it has in most of mathematics.
A logician's issue with the mathematical wording would be that "a theorem is sometimes called a proposition, sometimes a theorem" 14 . And thus "we have two words for the things that we prove, proposition and theorem" 14 .
Now, "what we prove, in particular, the premises and conclusion of a logical inference" 14 are not called propositions, but judgments or assertions.
There is one technicality here that one might not even notice. Strictly speaking, the word judgement, or assertion 27 , is used in particular for the premises and conclusion of a logical inference, where it usually means an affirmation or denial. Most of modern logic gets along with just affirmations. A formula is not affirmed directly; it has to be grasped as a proposition 28 and that proposition then can be affirmed. When, e.g., A ∨ B true is to be concluded from A true by a rule of disjunction introduction, then, grasping A and B as propositions, A prop and B prop do figure as premises for that rule, although they are neither an affirmation nor a denial. P. Martin-Löf extends the use of the word judgment to include such new forms of judgment, which are not only affirmations or denials anymore. Extending that usage allows us to denote premises and conclusion of an inference as judgments of some specific form. This wording is important because, for typed λ-calculus, our goal is the derivation of judgments of the form Γ ⊢ T : a, which means that T has type a in context Γ. These judgments appear as the premises and conclusion of type checking rules of inference. There are three basic introduction type checking rules of typed λ-calculus: introducing λ-abstraction, introducing function application and introducing variable usage.

27 P. Martin-Löf attributes it to B. Russell, translating Frege's Urteil into assertion, and calling the combination of Frege's judgment stroke "|" and content stroke "−" the assertion sign "⊢".

28 That use of the word proposition is again attributed to B. Russell.

Typed λ-calculus
In order to formalize the previously motivated application in Sec. 2, we spoke about univariate and multivariate functions, tuples made of scalars, and indices for various operations on tuples and multivariate functions. These all make valid types in our consideration, and therefore we model a "Type" in our augmented λ-calculus to be introduced by the following introduction rules: These mean that there is a type "fun11" for univariate functions and a type for scalars. For every natural number there is one type of indices, one type "funM1" of functions taking m arguments to a single output, and one type "fun1N" of functions taking a single input to n outputs. For every two natural numbers m and n there is a function type "funMN" taking m inputs to n outputs. These are purely syntactical introduction rules that are named semantically but do not yet have their intended meaning. But this "Type" serves as an index set over which we will define the family of valid terms, meaning we regard the totality of terms as partitioned by their "Type".
These rules, in a very exact sense, correspond to a datatype definition in the Agda 16 language. We have that for every "Type" that can be introduced by our stated introduction rules, there is exactly one element of the datatype that we have defined within the Agda language, and vice versa. This property makes it suitable to support our formalization as we go along.
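Although the paper's formalization is in Agda, the same family of introduction rules for "Type" can be mirrored as one constructor per rule in any language with algebraic-datatype-like definitions. This Python sketch with frozen dataclasses is our own analogy; only the constructor names are taken from the text.

```python
# One class per "Type" constructor: scalar, fun11, index, funM1, fun1N, funMN.
from dataclasses import dataclass

class Type:
    pass

@dataclass(frozen=True)
class Scalar(Type):
    pass

@dataclass(frozen=True)
class Fun11(Type):          # univariate functions
    pass

@dataclass(frozen=True)
class Index(Type):          # one index type per natural number
    n: int

@dataclass(frozen=True)
class FunM1(Type):          # m arguments to a single output
    m: int

@dataclass(frozen=True)
class Fun1N(Type):          # a single input to n outputs
    n: int

@dataclass(frozen=True)
class FunMN(Type):          # m inputs to n outputs
    m: int
    n: int

print(FunMN(2, 3) == FunMN(2, 3), Index(2) == Index(3))  # True False
```
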
Here N is inductively defined in the usual way, which is not of much interest here. For the formalization we also introduced the totality "Name" of names from which variables are chosen, but this is also a minor point.
Some of the introduction type checking rules of λ-calculus in general are λ-abstraction and function application. Where the latter is usually denoted just by juxtaposition, without an explicit operator, we emphasize it here by using explicit application operators.
The first rule of λ-abstraction, for a and b being types in our λ-calculus and x being a name, is written as

Γ, (x, a) ⊢ T : b
----------------------
Γ ⊢ λ x . T : a → b

It takes us from the judgment that « in a context consisting of, first, Γ and, second, the variable x being of type a, the term T is of type b » to the judgment that « in context Γ the term λ x . T is of type a → b ».

29 The Agda language, on the one hand, can be introduced as a functional programming language that, on the other hand, is powerful enough to express constructive mathematics. Agda builds on top of a type theory as introduced by P. Martin-Löf. It supports dependently typed pattern matching, using so-called Miller pattern unification, with Σ-types, inductive datatypes and universe polymorphism.
In our application in Sec. 2 we only needed to abstract over scalars or tuples, so a much stricter rule can be used for a formalization. We have chosen four much simpler rules instead, which fix the types to the four combinations of tuples and scalars. You can find them in appendix 6.3. The reason for this simplification is that the following interpretation in Sec. 4 becomes easier with the types fixed. This is possible because we are not targeting a general-purpose programming language, but rather a very specific one targeting just the partial derivative.
A second rule introduces function application: it takes us from the two judgments that « in a context Γ the term T is of type a → b » and « in the same context Γ the term U is of type a » to the judgment that « in context Γ, T U is of type b ». In our formalization we chose to have four such introduction rules, one for each type of function, operating on the corresponding argument types.
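The two rules can be made concrete in a short sketch. The following is not the paper's Agda formalization but a minimal run-time type checker in Python over a context Γ of (name, type) pairs; the type tags "scal", "tup" and ("fun", a, b) are illustrative stand-ins for the paper's fixed types.

```python
# Minimal sketch of the λ-abstraction and application typing rules.
# Terms are tagged tuples; types are "scal", "tup", or ("fun", a, b).

def type_of(ctx, term):
    """Return the type of `term` in context `ctx`, or raise TypeError."""
    tag = term[0]
    if tag == "var":                       # look the name up in Γ
        _, x = term
        for name, ty in reversed(ctx):     # innermost binding wins
            if name == x:
                return ty
        raise TypeError(f"unbound variable {x}")
    if tag == "lam":                       # Γ,(x:a) ⊢ T : b  ⇒  Γ ⊢ λx.T : a → b
        _, x, a, body = term
        b = type_of(ctx + [(x, a)], body)
        return ("fun", a, b)
    if tag == "app":                       # Γ ⊢ T : a → b and Γ ⊢ U : a  ⇒  Γ ⊢ T U : b
        _, t, u = term
        ft, at = type_of(ctx, t), type_of(ctx, u)
        if ft[0] == "fun" and ft[1] == at:
            return ft[2]
        raise TypeError("ill-typed application")
    raise TypeError(f"unknown term {tag}")

# λx. x at type scal → scal, and its application to a scalar variable:
ident = ("lam", "x", "scal", ("var", "x"))
print(type_of([], ident))                                      # ('fun', 'scal', 'scal')
print(type_of([("y", "scal")], ("app", ident, ("var", "y"))))  # scal
```

Here an ill-typed application raises an error at run time, whereas the Agda datatype makes such terms unrepresentable in the first place.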

De Bruijn indices
There is one basic key technique to work out when developing sound introduction rules that really respect the typing of variables with respect to some context Γ. It is noteworthy that this technique is necessary to obtain correctness guarantees from a programming language's type checker, as motivated in Sec. 1.3. Unfortunately, it is only expressible in a dependently typed programming language; in other programming languages, the following rules reduce to a list data structure.
As introduction rules for a context we chose that there is an empty context [] : Context and that, given a context Γ, we can form a new context Γ, (x, a) for every name-type combination (x, a).

Γ : Context    x : Name    a : Type
─────────────────────────────────── (−, (−, −))
Γ, (x, a) : Context

These rules make a context a list of tuples, each containing a name and a type. The de Bruijn indices 11 that we are going to work out will be indices that are guaranteed by their type to really point to a specific name-type combination within such a context. One could even model a context as a list of just types, without the names; within the Agda formalization, however, we found it very expressive to have this redundant piece of information available. Still, a variable is to be identified by its de Bruijn index and not by its name.
The first rule introduces the judgment that « de Bruijn index zero is an element of the type of de Bruijn indices showing that the first name-type combination of a context is in that context ».

Γ : Context    x : Name    a : Type
─────────────────────────────────── (zero)
zero : (x, a) ∈ Γ, (x, a)

The second rule introduces the judgment that « an incremented de Bruijn index shows that a name-type combination is part of an appended context, given that it did so before ».
With these rules it is possible to give meaning to the last standard introduction type-checking rule of λ-calculus. It introduces the judgment that « in context Γ the variable x is of type a, because of (∵) the de Bruijn index i_b ».
While this matches our Agda formalization closely, the rule is sometimes written more intuitively as

x : a ∈ Γ
─────────── (Var)
Γ ⊢ x : a

or even

x ∈ Γ
───────────── (Var)
Γ ⊢ x : Γ(x)
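The intended reading of such indices can be sketched in a few lines. The following Python model is not the formalization itself: the context is a plain list of (name, type) pairs, and a de Bruijn index counts from the newest entry, so validity can only be checked at run time, whereas in Agda the index's type guarantees it points at a real entry.

```python
# Sketch: contexts as lists, de Bruijn indices as plain naturals.
def lookup(ctx, i):
    """Resolve de Bruijn index i: 0 is the most recently appended entry."""
    if i >= len(ctx):
        raise IndexError("invalid de Bruijn index")
    return ctx[-(i + 1)]          # count from the end, newest first

ctx = [("x", "scal"), ("y", "tup"), ("x", "scal")]   # shadowing allowed
print(lookup(ctx, 0))   # ('x', 'scal')  -- the innermost x
print(lookup(ctx, 1))   # ('y', 'tup')
```

The shadowed name "x" in the example shows why a variable is identified by its index: index 0 and index 2 denote two distinct variables that merely share a name.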

Augmenting λ-calculus
Merely by using the λ-calculus we were led to the previous three rules and to the use of de Bruijn indices, even without any specifics from our application. After paying that entry fee, for which not many alternatives exist, we can finally work in our application-specific operations, which are: substitution of the i-th component, projection of the i-th component, composition of functions, the univariate derivative, and scalar multiplication.
For the previously introduced types in our specific λ-calculus, we introduced six rules for substitution, six rules for projection, four obvious rules for composition of functions, one rule for the univariate derivative, and one rule for scalar multiplication. The rules can be found in appendix 6.3, and the Agda datatype of all well-formed λ-calculus terms is shown in Fig. 7.

Chain of Justification
All the data structures and data transformations described in Sec. 2 represent computations for the partial derivative function. But even after translating them into an augmented λ-calculus, they are not yet more than mere skeletons carrying around meta-data. All the transformations we now implement on these λ-terms, which should respect this yet hypothetical computation, are just operations transforming that meta-data.
The resulting λ-terms can only be turned into a computation when a lower layer is present, i.e. an implementation providing the univariate derivative and a representation of functions, such that the terms can be interpreted, i.e. turned into a computation and executed.
Only a few properties can even be proven without further assumptions at this high level. We have made the distinction between the computational equivalence ≡, which is justified within our investigation by the computational equivalences of the λ-calculus, and the propositional equality =, which is used whenever a property of the univariate derivative, the operation we presupposed for our whole consideration, is made use of.
For the computational equivalences ≡ there is some chance to express them in terms of α-conversion, β-reduction and η-reduction. The provability of the equivalences denoted by =, however, depends on the underlying interpretation.
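A single β-step illustrates what ≡ tracks without appealing to any interpretation. The sketch below works on named terms rather than de Bruijn terms, and deliberately ignores variable capture, so it is only safe on the closed arguments used here; it is an illustration, not the mechanism of the formalization.

```python
# One β-reduction step on named λ-terms (capture-avoidance omitted).
def subst(term, name, arg):
    """Replace free occurrences of `name` in `term` by `arg`."""
    tag = term[0]
    if tag == "var":
        return arg if term[1] == name else term
    if tag == "lam":
        _, x, body = term
        return term if x == name else ("lam", x, subst(body, name, arg))
    if tag == "app":
        return ("app", subst(term[1], name, arg), subst(term[2], name, arg))
    return term

def beta(term):
    """Reduce a redex ((λx.T) U) to T[x := U]."""
    (_, (_, x, body), u) = term
    return subst(body, x, u)

redex = ("app", ("lam", "x", ("app", ("var", "x"), ("var", "x"))), ("var", "u"))
print(beta(redex))    # ('app', ('var', 'u'), ('var', 'u'))
```

Both sides of such a step are equal by computation alone, which is exactly why ≡ needs no assumption about the univariate derivative.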
Consistency of the computational equivalence resulting from the presented transformations depends, of course, on a consistent implementation of the considered layer, and precisely on a consistent implementation of these two equality-transformations by the lower layer. These two equality-transformations are, in some sense, dependencies of our considered layer. The benefit is that the implementation of the considered layer can be verified independently of a lower-level application, increasing the overall trust and decomposing monolithic software ventures into more modular ones. Similar to the two assumptions =_lin and =_chain of the univariate derivative, it is possible to determine additional assumptions that become necessary in proofs of additional theorems. In Sec. 4 we give guidance on how these rather abstract assumptions become more concrete with a chosen interpretation of the augmented λ-calculus.

Figure 7: The datatype "Term" of well-formed terms for an augmented λ-calculus, suitable to express the partial derivative, within the Agda programming language. Fin k is the type of natural numbers less than k, sometimes denoted N_k or N_<k.

Software Architecture
In Sec. 3 a family of datatypes for well-formed terms in an augmented λ-calculus was set up. That family was indexed by a "Type", being the term's type, and a "Context", realizing a valid use of variables. This should serve as a foundation for an implementation of our application from Sec. 2. Objects and equivalences from that application, the partial derivative and the multivariate chain rule, can be expressed within this λ-calculus. But while these translations are guaranteed to be well-formed terms, this still does not make a computation. In Sec. 1.2 it was motivated how formulating the construction of a shape function element within a programming language easily leads to a meta-implementation. The meta-implementation's purpose is to generate an efficient implementation, whereas the meta-implementation itself should be focused on validity rather than efficiency. Our approach in Sec. 3 should make it possible to achieve a high validity in a meta-implementation.

Figure 8: Objects in a computer program involved in an electromagnetic transformation. The blue bits tag number data used when requesting particular physical field numbers, e.g. for computing a numerical quadrature. The green bits tag predominant computations and fields necessary to be represented in the computer's memory. It is important to note that an electromagnetic field quantity is represented as one of the green bits, even though its contained degrees of freedom are at first thought of as blue bits.
We now focus on how to give the λ-terms a suitable interpretation. Our approach has changed the task of giving a direct interpretation to partial derivatives into the task of giving a direct interpretation to some lower-level primitives. These are: variable access, quantification, substitution, projection, composition, and the two special functions, the univariate derivative and scalar multiplication. Replacing one notion of partial derivative 30 with this many operations seems like quite a lot of machinery, and perhaps not worth the trade.
At this point we argue, first, that this approach is in some sense minimal. Decomposing the partial derivative into more basic notions, as in equation 23, involves only a notion of the univariate derivative of a certain function 31. With our elaboration in Sec. 2 we collected the obligations that arise when working out such a decomposition in a way precise enough to reach a level of "correctness and completeness necessary to get a computer program to work" 20. This level might be "a couple of orders of magnitude higher" 20 than the level needed to convince humans.

30 or an evaluation of Jacobian matrices, if you like so
31 which corresponds to what M. Spivak mentioned as the ordinary derivative, cited in equation 8
Second, in an implementation on a machine one might have nothing more than coordinate transition functions as representations of the data of boundary value problems. Recall the motivation from Sec. 1.1, which promised one generic rule for coordinate transformations, indexed by three indices (p, q, ω). These should cover the electromagnetic quantities of interest that will show up in an implementation applying the techniques mentioned in Sec. 1.2. There we identified the degrees of freedom as an important ingredient; these were mappings from a polynomial differential form u to a real number. Here we would identify u as the section of an associated bundle as in Sec. 1.1. Since this section might not be directly representable within the machine's memory, we would handle a coordinate representation of it instead. This was denoted xyz σ in Fig. 2, or phys. field numbers corresp. to xyz in Fig. 8.
Evaluation of these integrals, in order to produce data for a discrete linear system to be solved, is usually done by numerical quadrature. That quadrature queries the physical field quantity's value at some specific coordinates. If one cannot, or does not want to, predict which physical field values will be queried, it might be appealing to represent the physical field as a computation. Of course, that computation might internally interpolate field values at some specific coordinates with polynomials, as is the case with the polynomial differential forms.
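The "field as a computation" idea can be sketched directly: the quadrature only needs to query field values at its nodes, so the field is passed as a callable. The helper `gauss2` below is a hypothetical stand-in using 2-point Gauss-Legendre quadrature on [-1, 1], and the polynomial field is an illustrative example, not one of the paper's shape functions.

```python
# Sketch: a quadrature that queries a field given as a computation.
import math

def gauss2(field):
    """Approximate the integral of field over [-1, 1] by 2-point
    Gauss-Legendre quadrature (nodes ±1/√3, weights 1); exact for
    polynomials up to degree 3."""
    x = 1.0 / math.sqrt(3.0)
    return 1.0 * field(-x) + 1.0 * field(x)

# The field internally interpolates with a polynomial, as with the
# polynomial differential forms; the quadrature never sees that.
field = lambda x: 3.0 * x**2 + x + 1.0
print(round(gauss2(field), 12))   # 4.0, the exact integral of 3x² + x + 1
```

Because the quadrature never inspects the field's representation, the same routine works whether the field is a closed-form expression, an interpolant, or a chain of transformed computations.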
When speaking of finite elements, these are usually already transformed to a reference element in order to evaluate a numerical quadrature. Such an approach is desirable, since it allows the quadrature to be implemented in a fixed way with precalculated coefficients. We might therefore say that every computer program implementing this technique already has to deal with partial derivatives in some way. But having just one integral transformation might not be worth the effort we have made in the previous part of this paper. We think that the construction of the boundary value problem itself might be given in terms of a longer chain of composed coordinate transformations. The representation of a boundary value problem could then be internally encoded as an equivalence transformation out of primitives. But this is just a motivation for our approach.
We argue it to be, at least, a justified perspective that these kinds of coordinate transformations 32 apply to a wide range of numerical software, as sketched in Sec. 1.2. Here the particular focus was on boundary value problems expressible by some Hodge-Laplacian over a manifold, discretized with a simplicial complex.
In a broader sense, we understand a part of software as a mapping of mathematical models for coordinate transformations, given by (1) in terms of the partial derivative, into a programming language. We decompose this mapping into, first, a mapping of the partial derivative into a formal language based on λ-calculus and, second, a mapping of these λ-calculus terms to computations in a concrete programming language. The benefit is that the first mapping can be discussed and justified on a theoretical basis, whereas the second is much more arbitrary in nature: arbitrary in the sense that we have to deal with different computational models to create programs that run on different kinds of machines. We will now elaborate on some of these more arbitrary ways to map our specific λ-terms to a concrete programming language, or rather to the model of evaluation coming with such a concrete programming language. A concrete programming language for that purpose comes with a syntax and an evaluation strategy.

32 or integral transformations, if you like so

Interpreting λ-terms formally
A straightforward way, since we already have an Agda formalization, would be to continue by implementing an evaluation function. Doing so is a typical task, and two new concepts occur: for a context Γ and one of our custom types a, the evaluation function "eval" maps an environment of that context and a λ-term of that type and context to a value of the interpretation of a.
eval : Environment Γ → Term Γ a → Interpretation a

Here, "Interpretation" maps our custom types from Fig. 6 to types of the Agda language, which are elements of the universe "Set": an interpretation of our λ-terms' types is really captured by a function from our previously defined "Type" to the types of the programming language, which is Agda in this case.
We showed in Sec. 3 that a context, in the way we introduced it, can be regarded as a list holding multiple variable name and type combinations. An environment for such a context can be regarded as holding the corresponding values of these types. We might only use an environment by looking up variables, which are de Bruijn indices in our formalization. This shows that the choice of implementing an environment is already a little less fixed. We could mimic the context and use a list, but in contrast to the context, the environment will be present in our evaluation's computation, where we might forget about the context completely. One might prove within Agda that an implementation of lookup never fails when given a valid de Bruijn index, and then strip all the type information, revealing bare computations. Therefore, with the environment, we do want to incorporate some aspects of performance.

Speaking of performance, we might have a hard time continuing to use the Agda language itself as a target for evaluation. It is possible to implement numerical software within this language 33, but this programming language's environment offers only limited help in putting the machine into its most efficient state for the purpose of a computation. A typical goal is to map scalars to unboxed 34 floating point machine numbers, which lack many of the properties of their counterparts from R. The Agda programming language offers more support for showing properties of rationals or constructive real numbers. Unfortunately, these numbers tend to have a representation that makes them unsuited for numerical computations. But that is a usual trade-off we encounter a lot with numerical algorithms: once their theory is worked out for exact real numbers and the algorithm is stable, we apply it to inexact floating point machine numbers, fingers crossed 35.
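The shape of such an evaluation function can be sketched in plain Python. This is a hypothetical, untyped analogue of eval : Environment Γ → Term Γ a → Interpretation a, with the environment mirroring the context positionally; the "const" constructor is added here only so the example is closed.

```python
# Untyped sketch of eval: environments are lists, variables are de
# Bruijn indices into them, λ is interpreted as a Python closure.
def eval_term(env, term):
    tag = term[0]
    if tag == "const":
        return term[1]
    if tag == "var":                 # index 0 is the newest binding
        return env[-(term[1] + 1)]
    if tag == "lam":                 # extend the environment on application
        _, body = term
        return lambda v: eval_term(env + [v], body)
    if tag == "app":
        _, f, x = term
        return eval_term(env, f)(eval_term(env, x))
    raise ValueError(f"unknown tag {tag}")

# (λx. λy. x) 1.0 2.0 in the empty environment:
k = ("lam", ("lam", ("var", 1)))
term = ("app", ("app", k, ("const", 1.0)), ("const", 2.0))
print(eval_term([], term))           # 1.0
```

Here an invalid de Bruijn index fails at run time; the point of the Agda version is that the typed eval can never reach such a failure, so the index checks can be stripped afterwards.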
What we also get with the evaluation as a mapping is the possibility to prove preservation of the =_lin and =_chain equivalences from Sec. 2 for our intended implementation. If we choose to interpret our custom λ-calculus functions really as functions, then in particular the univariate derivative might be chosen to be an inexact black-box operation such as the difference quotient. In that case, chances are high that we lose the possibility to prove exactly that =_lin and =_chain are preserved by our evaluation function, even when operating on exact rational numbers. That could be intended, and we might formally track error bounds through all our operations to prove that the error resulting from this operation amortizes in comparison to some other error. But the more interesting case would be to interpret λ-functions not as functions but as data structures, with a more interesting interpretation of λ-function application as a data transformation. This perspective is elaborated in Sec. 4.3.

Interpreting λ-terms less formally
One might have a formal model of a programming language at hand such that λ-terms can be translated. Depending on the degree of formalism, surjectivity of the eval function can be proven, or even that the resulting terms are still well-defined in the target language. This is essentially a type of code generation, where the weakest variant would be to interpret all our custom λ-terms within the string monoid, calling the result "code". Even if one does not formally model the target language, this still gives the possibility to implement transformations at the λ-term level before they are evaluated to code. Although introducing a large margin for interpretation and bugs, this approach could be well suited for generating code running in a very limited, e.g. lock-step, environment.

33 The Agda language is implemented in Haskell and can use Haskell methods, which in turn, via a foreign function interface, can call arbitrary system libraries.
34 Programming languages with automatic reference and memory management tend to implicitly attach typing information to a value in order to treat this value via references instead. This is done because references into memory on a machine have a uniform representation. It is called "boxing" of a value. Boxing often demands memory allocation. Preventing frequent memory allocation, e.g. per-scalar memory allocation, is very important for achieving high efficiency in an implementation.
35 Unless using promising techniques such as interval arithmetic to provide guarantees for this approach.

Interpreting λ-terms as data transformations
As motivated before, an interpretation of λ-functions into data structures of a target language might currently be the most rewarding one. For the finite element spaces of our application, a lot of different multivariate polynomial functions have to be operated with. These functions can be represented by polynomial coefficients, together with a custom function application operation that uses these coefficients to compute the polynomial. Furthermore, the univariate derivative operation on such a coefficient representation is not only very cheap, but also exact when using exact number representations. That enables us to prove computationally that the eval function preserves =_lin and =_chain.
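The coefficient representation and its exact derivative can be sketched briefly. The helpers below are illustrative, not the paper's implementation: a univariate polynomial is a list [c0, c1, ...] meaning Σ ci·x^i, and with exact Fraction coefficients an instance of =_lin, linearity of the derivative, holds on the nose rather than up to rounding.

```python
# Sketch: polynomials as coefficient lists with an exact derivative.
from fractions import Fraction

def d(p):
    """Derivative on coefficients: Σ ci·x^i  ↦  Σ i·ci·x^(i-1)."""
    return [Fraction(i) * c for i, c in enumerate(p)][1:]

def add(p, q):
    n = max(len(p), len(q))
    p, q = p + [Fraction(0)] * (n - len(p)), q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(s, p):
    return [s * c for c in p]

p = [Fraction(1), Fraction(2), Fraction(3)]   # 1 + 2x + 3x²
q = [Fraction(0), Fraction(5)]                # 5x
a, b = Fraction(7), Fraction(-2)

# An instance of =_lin: d(a·p + b·q) = a·d(p) + b·d(q), exactly.
lhs = d(add(scale(a, p), scale(b, q)))
rhs = add(scale(a, d(p)), scale(b, d(q)))
print(lhs == rhs)    # True
```

A difference-quotient derivative would make the same comparison fail by a rounding error, which is exactly the contrast drawn in the previous subsection.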
In a meta-implementation for generating an efficient implementation, it seems reasonable to start out with an initial candidate for the implementation and then apply rewrite rules to optimize it. Implementing correct rewrites and data transformations is where Agda, and functional programming in general, shines. The reason is the inductive definition of the data structures in question, which enables an exhaustiveness check proving functions to be total, i.e. that no case has been missed. This usually pays off in case-analysis-heavy applications such as designing domain-specific languages, as we do here, or improving the encoding of a data structure. As for multivariate polynomials, these can obviously be represented as packed chunks of computer memory holding their coefficients. But we might add some information, or invent an interesting reference type, for better tying them to the simplicial complex they originate from. Having proven equivalences for one obvious encoding, it can be easier to show a new encoding isomorphic to it, transferring the proofs.

Conclusion
We have explained transformations on the partial derivative in terms of computational notions from λ-calculus with an additional term substitution. This mechanism has been implemented to generate listings for the general case, as in appendix A, and for all concrete multivariate cases, indexed by j ∈ N and i ∈ [1, j], exemplarily for j = 2 and i = 1, as in appendix B, out of the same internal representation. We argued what general obligations arise when translating the theory into a computational layer of abstraction, for which the λ-calculus served as a model. We showed how a translation into an augmented λ-calculus can be formalized in type-theoretical terms and implemented that formalization in the Agda programming language. Finally, we gave some examples of how to make use of the presented approach and favored one particular possibility. Our current research pursues this exact undertaking, and the foundational considerations are shown in this contribution.
Small programs as well as big software, whether directly implementing this layer or not, will suffer from the inevitable tediousness of coordinate transformations when exploiting these techniques extensively. That does not pose a problem as long as one is aware of the issue and actively increases rigor when this kind of complexity gets out of control. We have presented a way to establish that direction of rigor, motivated by the application of encoding the transformation laws common to electromagnetic theory. Accompanying it is an interpretation to guide an implementation demanding it.

Appendix
In the appendix we give a listing of the computational equivalences used to demonstrate the dependencies of the notion of partial derivative and the chain rule of the partial derivative on the notion of univariate derivative and the corresponding univariate chain rule. Both listings have been created out of the same internal representation, with the rules of parameter-pack expansion borrowed from the C++ programming language, using our own implementation of parameter-pack expansion supporting the mentioned substitution. For the expanded listing in 6.2 we chose f, g : R² → R² and i = 1.
Furthermore we attached a translation of the Agda datatype of well-formed λ-terms from Fig. 7.