Warburton Research Group

Home Team Projects Software Publications Support Lectures

Tim Warburton is the John K. Costain Faculty Chair in the College of Science, and Professor of Mathematics at Virginia Tech. He is also affiliated with the Computational Modeling and Data Analytics program at Virginia Tech. His research team has ongoing projects related to developing new numerical methods and computational tools for physical modeling.

The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results using finite difference, spectral element and discontinuous Galerkin methods show OCCA delivers portable high performance in different architectures and platforms.

Native APIs: MATLAB, Python, Julia, C++, C#, C, F90. Optional back-ends: OpenCL, CUDA, OpenMP, Pthreads, Intel COI.

To obtain the solutions of large sparse linear system of equations arised from the discretization of elliptic partial differential equations. The method uses an aggregation strategy based on parallel maximal independent set algorithm and defines piecewise constant interpolations from coarse to fine grids. The solution procedure uses optimal Krylov accelerated cycling strategy (K-cycle). The implementations use OCCA2 for portability across several hardware architectures and multi-threading approaches.

To predict the tsunami wave propagation using discontinuous Galerkin methods on triangular meshes for two dimensional Shallow water equations. The method uses multirate Adams-Bashforth method for time integration, a positivity preserving method, and a slope limiter for stability of the numerical scheme. The implementations use OCCA2 for portability across several hardware architectures and multi-threading approaches.

To simulate a laser intensity in laser-induced thermo therapy, we developed gLaser application which solves Radiative Transport Equation in biological tissue. The algorithm applies Spectral Element method in space and sphercial harmonics in the velocity field. The implementation uses OCCA which enables the app to run in difference parallel modes.

We employ Hermite methods to discretize a variety of Hyperbolic Initial Value Problems. Building on Hermite interpolation, these methods allow for the construction of arbitrary order methods based on 2k-point stencils in k spatial dimensions. A remarkable feature is that the CFL condition of the resulting scheme is independent of the approximation order. The implementation uses OCCA2 for portability across various computing architectures.

Nodal Discontinuous Galerkin Methods: my interests include developing practical numerical methods for time-dependent electromagnetics, acoustics, computational seismology, and fluid dynamics and also steady state elliptic problems. In particular I am working to improve the efficiency and robustness of these methods through the use of local time-stepping methods, artificial viscosity for shock capturing, preconditioners, and convergent adaptive schemes.

HPC: My team at VirginiaTech is part of the Argonne Exascale Co-Design center: CESAR. We are developing programming tools for programming at exascale. I am working with David Medina to develop a new multi-threading framework, OCCA, that provides a unified abstract interface and compute kernel programming model that interfaces behind the scenes with OpenCL, CUDA, pThreads, and OpenMP threading models.

Applications: I am collaborating with the Shell Computational and Modeling group to develop a world class seismic inversion simulation platform based on the GPU accelerated discontinuous Galerkin methods that my research team has developed. I am also working with researchers at the MD Anderson Cancer Center to develop a prototype therapy modeling tool for the Magnetic Resonance Guided Laser Instertitial Thermal Therapy (MRgLITT) that can provide predictive capabilities..

Recent Activity:

Accelerating Shallow Water Modeling(web link).

OCCA: An Extensible Portability Layer for Many-Core Programming (web link, slides).

Tutorial slides: (web link).

Accelerating High-order Methods, International Conference on Spectral and High-order Methods (abstract).

Low Storage Curvilinear Discontinuous Galerkin Methods (abstract).

OCCA: A Unified Approach to Multi-Threading Languages.

OCCA: A Unified Approach to Multi-Threading Languages, (program).

Many-core Algorithms for High-order Finite Element Methods: when time to solution matters (slides).

Research Interests: High order finite element methods, adaptivity, DG, hybridized DG, stabilized methods, least squares methods, wave equations, computational fluid dynamics.

Projects: Adaptive Plane Wave DG, Discontinuous Petrov-Galerkin methods

Recent Activity:

Computers and Fluids.

Computers and Mathematics with Applications.

Research Interests: computational methods for wave-like problems (in acoustics, electromagnetism, elastodynamics, oceanography, ...), absorbing boundary conditions, absorbing layers, perfectly matched layers, ...

Projects: Computational Methods for Seismic Imaging

Recent Activity:

Optimizing perfectly matched layers in discrete contexts (Web Link)

Research Interests: Numerical PDEs, High Performance Computing

Projects:

Recent Activity:

Discontinuous Galerkin Method in the Full-Waveform Inversion Framework

GPU Accelerated Lattice Boltzmann Method in Core Sample Analysis

Ali
Karakuş

Jean-François
Remacle

Research Interests:

Projects:

Recent Activity:

Research Interests: High performance numerical methods for scientific computing applications

Projects:

Recent Activity:

A GPU Accelerated Aggregation Algebraic Multigrid Method (Web Link)

GPU Accelerated Numerical Methods For Tsunami Modeling (Web Link)

ALMOND: Algebraic Multigrid On Numerous Devices (Web Link)

GPU Accelerated Discontinuous Galerkin Methods for Shallow Water Equations (ArXiv Link)

GPU Accelerated Numerical Methods For Tsunami Modeling

High-order Numerical Methods for High-Contrast Seismic Imaging

Research Interests: Fast methods for solving discrete systems arising from high order finite element methods for time and frequency domain wave propagation.

Projects: Adaptive plane wave DG.

Recent Activity:

Rapidly Iterating from Prototype to Near-C Performance in Julia: A Finite Element Method Case Study. (Video)

Parallel Adaptive Trefftz Discontinuous Galerkin Methods.

Explicit Discontinuous Galerkin Methods for Linear Hyperbolic Problems.

Research Interests: Device-accelerated algorithms for numerical methods

Projects:

Recent Activity:

High-Order Finite-differences on multi-threaded architectures using OCCA (ArXiv Link)

OCCA: A Unified Approach to Multi-threading Languages (Link)

OCCA: A Unified Approach to Multi-threading Languages (Program Link)

OCCA: A unified approach to multi-threading languages

OCCA: A unified approach to multi-threading languages (ArXiv Link)

GPU Accelerated Discontinuous Galerkin Methods for Shallow Water Equations (ArXiv Link)

GPU Accelerated Numerical Methods for Tsunami Modeling

High-order Numerical Methods for High-Contrast Seismic Imaging

Tools for multi-core computing:

OCCA2: The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results using finite difference, spectral element and discontinuous Galerkin methods show OCCA delivers portable high performance in different architectures and platforms.

Native APIs: MATLAB, Python, Julia, C++, C#, C, F90. Optional back-ends: OpenCL, CUDA, OpenMP, Pthreads, Intel COI.

ALMOND: algebraic multigrid solver accelerated with OCCA2

Features:

Non-smooth prolongation/restriction based aggregation multigrid.
OCCA accelerated on-device AGMG setup and AMG cycling.
Black-box AGMG.
MATLAB interface.
Dependencies: OCCA2.

Unstructured grid discontinuous Galerkin based solvers:

MIDG2: A time-domain Maxwell's solver based on discontinuous Galerkin discretization in space and low-storage Runge-Kutta discretization in time accelerated by the OCCA2 extensible many-core programming interface. Capabilities include: Maxwell's equations solved in 2D and 3D on unstructured grids, hybrid MPI+OCCA2 parallelism.

pasidg: A time-dependent shallow water equation solver. Capabilities: reads GEBCO bathymetry data (link), GSHHS coastline data (link), gmsh meshing (link), interactive global region selection, multirate linear multistep local time-stepping, positivity preserving limiter, TVB limiting, and OCCA2 based acceleration.

RiDG: A time-dependent linear acoustic-elastic water equation solver. Capabilities: multirate linear multistep local time-stepping, MPI+OCCA2 hybrid parallelism, acoustic/vti/tti/elastic modules, disk-free reverse time migration, consistent temporal correlation, time-reversed multirate time-stepping.

gNUMA:

Spectral element based solvers:

brainNek:

Solver for Pennes bioheat equation.
Spectral element method based spatial discretization.
OCCA2 acceleration.
Models tissue inhomogeneity, probe water-cooling, tissue coagulation.
Choice of light fluence model: diffusion approximation for radiative transfer equation or Green’s function method.
Choice of preconditioner: 1 point, 1 level overlapping additive Schwartz preconditioner, Jacobi preconditioner, or no preconditioner.
Test cases: laser point source & comparison with empirical MRTI data.

vNek:

J. S. Hesthaven and T. Warburton. Nodal discontinuous Galerkin methods: Algorithms, analysis and applications. Springer, 978-0-387-72065-4, Texts in Applied Mathematics 54, 2008.

Back cover: “This book discusses a family of computational methods, known as discontinuous Galerkin methods, for solving partial differential equations. While these methods have been known since the early 1970s, they have experienced an almost explosive growth interest during the last ten to fifteen years, leading both to substantial theoretical developments and the application of these methods to a broad range of problems. These methods are different in nature from standard methods such as finite element or finite difference methods, often presenting a challenge in the transition from theoretical developments to actual implementations and applications. This book is aimed at graduate level classes in applied and computational mathematics. The combination of an in depth discussion of the fundamental properties of the discontinuous Galerkin computational methods with the availability of extensive software allows students to gain first hand experience from the beginning without eliminating theoretical insight.”

Chinese translation: 交点间断Galerkin方法:算法、分析和应用 by Jichun Li and Tao Tang, amazon.cn

Preprints:

David S Medina, Amik St-Cyr, and T. Warburton. High-Order Finite-differences on multi-threaded architectures using OCCA, arXiv preprint, 2014.

Abstract: “High-order finite-difference methods are commonly used in wave propagators for industrial subsurface imaging algorithms. Computational aspects of the reduced linear elastic vertical transversely isotropic propagator are considered. Thread parallel algorithms suitable for implementing this propagator on multi-core and many-core processing devices are introduced. Portability is addressed through the use of the OCCA runtime programming interface. Finally, performance results are shown for various architectures on a representative synthetic test case.”

David S Medina, Amik St-Cyr, and T. Warburton. OCCA: A unified approach to multi-threading languages, arXiv preprint, 2014.

Abstract: “The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results using finite difference, spectral element and discontinuous Galerkin methods show OCCA delivers portable high performance in different architectures and platforms.”

Shelvean Kapita, Peter Monk, and T. Warburton. Residual based adaptivity and PWDG methods for the Helmholtz equation, arXiv 2014.

Abstract: “We present a study of two residual a posteriori error indicators for the Plane Wave Discontinuous Galerkin (PWDG) method for the Helmholtz equation. In particular we study the h-version of PWDG in which the number of plane wave directions per element is kept fixed. First we use a slight modification of the appropriate a priori analysis to determine a residual indicator. Numerical tests show that this is reliable but pessimistic in that the ratio between the true error and the indicator increases as the mesh is refined. We therefore introduce a new analysis based on the observation that sufficiently many plane waves can approximate piecewise linear functions as the mesh is refined. Numerical results demonstrate an improvement in the efficiency of the indicators.”

R. Gandham, D.S. Medina and T. Warburton. GPU Accelerated Discontinuous Galerkin Methods for Shallow Water Equations, arXiv 2013.

Abstract: “We discuss the development, verification, and performance of a GPU accelerated discontinuous Galerkin method for the solutions of two dimensional nonlinear shallow water equations. The shallow water equations are hyperbolic partial differential equations and are widely used in the simulation of tsunami wave propagations. Our algorithms are tailored to take advantage of the single instruction multiple data (SIMD) architecture of graphic processing units. The time integration is accelerated by local time stepping based on a multi-rate Adams-Bashforth scheme. A total variational bounded limiter is adopted for nonlinear stability of the numerical scheme. This limiter is coupled with a mass and momentum conserving positivity preserving limiter for the special treatment of a dry or partially wet element in the triangulation. Accuracy, robustness and performance are demonstrated with the aid of test cases. We compare the performance of the kernels expressed in a portable threading language OCCA, when cross compiled with OpenCL, CUDA, and OpenMP at runtime.”

Accelerated Numerical Methods:

Andreas Kloeckner, Timothy Warburton and Jan S Hesthaven, High-Order Discontinuous Galerkin Methods by GPU Metaprogramming, Book: GPU Solutions to Multi-scale Problems in Science and Engineering, Series: Springer Lecture Notes in Earth System Sciences 2013, pp 353-374, 2013.

Abstract: “Discontinuous Galerkin (DG) methods for the numerical solution of partial differential equations have enjoyed considerable success because they are both flexible and robust: They allow arbitrary unstructured geometries and easy control of accuracy without compromising simulation stability. In a recent publication, we have shown that DG methods also adapt readily to execution on modern, massively parallel graphics processors (GPUs). A number of qualities of the method contribute to this suitability, reaching from locality of reference, through regularity of access patterns, to high arithmetic intensity. In this article, we illuminate a few of the more practical aspects of bringing DG onto a GPU, including the use of a Python-based metaprogramming infrastructure that was created specifically to support DG, but has found many uses across all disciplines of computational science.”

A. Kloeckner, T. Warburton and J.S. Hesthaven. Solving Wave Equations on Unstructured Geometries, GPU Computing Gems Jade Edition, Editor Wen-mei Hwu, Morgan Kaufmann Publishers, August 2011.

Abstract: “Discontinuous Galerkin (DG) methods for the numerical solution of partial differential equations have enjoyed considerable success because they are both flexible and robust: They allow arbitrary unstructured geometries and easy control of accuracy without compromising simulation stability. Lately, another property of DG has been growing in importance: The majority of a DG operator is applied in an element-local way, with weak penalty-based element-to-element coupling. The resulting locality in memory access is one of the factors that enables DG to run on off-the-shelf, massively parallel graphics processors (GPUs). In addition, DG’s high-order nature lets it require fewer data points per repre- sented wavelength and hence fewer memory accesses, in exchange for higher arithmetic intensity. Both of these factors work significantly in favor of a GPU implementation of DG. Using a single US$400 Nvidia GTX 280 GPU, we accelerate a solver for Maxwell’s equations on a general 3D unstructured grid by a factor of 40 to 60 relative to a serial computation on a current-generation CPU. In many cases, our algorithms exhibit full use of the device’s available memory bandwidth. Example computations achieve and surpass 200 gigaflops/s of net application- level floating point work. In this article, we describe and derive the techniques used to reach this level of performance. In addition, we present comprehensive data on the accuracy and runtime behavior of the method. ”

Nico Goedel, Steffen Schomann, Tim Warburton, and Markus Clemens, GPU Accelerated Adams- Bashforth Multirate Discontinuous Galerkin Simulation of High Frequency Electromagnetic Fields, IEEE Transactions on Magnetics, Volume 46, No. 8, Pages 2735-2738, August 2010.

Abstract: “A multirate Adams-Bashforth (AB) scheme for simulation of electromagnetic wave propagation using the discontinuous Galerkin finite element method (DG-FEM) is presented. The algorithm is adapted such that single-instruction multiple-thread (SIMT) characteristic for the implementation on a graphics processing unit (GPU) is preserved. A domain decomposition strategy respecting the multirate classification for computation on multiple GPUs is presented. Accuracy and performance is analyzed with help of suitable benchmarks.”

A. Kloeckner, Tim Warburton, Jeffrey Bridge, and Jan S Hesthaven. Nodal discontinuous Galerkin methods on graphics processors. Journal of Computational Physics 228 (21), 7863-7882, 2009.

Abstract: “Discontinuous Galerkin (DG) methods for the numerical solution of partial differential equations have enjoyed considerable success because they are both flexible and robust: They allow arbitrary unstructured geometries and easy control of accuracy without compromising simulation stability. Lately, another property of DG has been growing in importance: The majority of a DG operator is applied in an element-local way, with weak penalty-based element-to-element coupling. The resulting locality in memory access is one of the factors that enables DG to run on off-the-shelf, massively parallel graphics processors (GPUs). In addition, DG’s high-order nature lets it require fewer data points per represented wavelength and hence fewer memory accesses, in exchange for higher arithmetic intensity. Both of these factors work significantly in favor of a GPU implementation of DG. Using a single US$400 Nvidia GTX 280 GPU, we accelerate a solver for Maxwell’s equations on a general 3D unstructured grid by a factor of around 50 relative to a serial computation on a current-generation CPU. In many cases, our algorithms exhibit full use of the device’s available memory bandwidth. Example computations achieve and surpass 200 gigaflops/s of net application-level floating point work.”

Discontinuous Galerkin Methods:

T. Warburton, A Low Storage Curvilinear Discontinuous Galerkin Method for Wave Problems, SIAM Journal on Scientific Computing, Volume 35, Number 4, pages A1987-A2012, 2013.

Abstract: “The low-storage curvilinear discontinuous Galerkin (LSC-DG) method reduces the storage requirements for solving symmetric and linear linear symmetric conservation laws in curvi- linear domains when compared to the standard discontinuous Galerkin method. We perform a semidiscrete, a priori convergence analysis of LSC-DG and determine sufficient conditions on se- quences of meshes to guarantee the same rate of convergence as the original upwind DG method on curvilinear domains. Computational results confirm the optimal order convergence on sample mesh sequences that satisfy the sufficiency conditions. Additional results show that the sufficient conditions for optimal order convergence of LSC-DG may also be necessary conditions..”

C. Carstensen, R.H.W. Hoppe , N. Sharma, and T. Warburton, Adaptive Hybridized Discontinuous Galerkin Interior Penalty Methods for H(curl)-elliptic problem, Numerical Mathematics: Theory, Methods and Applications, Volume 4, pp. 13-27, 2011.

Abstract: “We develop and analyze an adaptive hybridized Interior Penalty Discontinuous Galerkin (IPDG-H) method for H(curl)- elliptic boundary value problems in 2D or 3D arising from a semi-discretization of the eddy currents equations. The method can be derived from a mixed formula- tion of the given boundary value problem and involves a Lagrange multiplier that is an approximation of the tangential traces of the primal variable on the interfaces of the underlying triangulation of the computational domain. It is shown that the IPDG-H technique can be equivalently formulated and thus implemented as a mortar method. The mesh adaptation is based on a residual-type a posteriori error estimator consisting of element and face residuals. Within a unified framework for adaptive finite element methods, we prove the reliability of the estimator up to a consistency error. The performance of the adaptive symmetric IPDG-H method is documented by numerical results for representative test examples in 2D.”

Annalisa Buffa, Ilaria Perugia, and Tim Warburton, The mortar-discontinuous Galerkin method for the 2D Maxwell eigenproblem, Journal of Scientific Computing, Volume 40, pp. 86-114, 2009.

Abstract: “We consider discontinuous Galerkin (DG) approximations of the Maxwell eigenproblem on meshes with hanging nodes. It is known that while standard DG methods provide spurious-free and accurate approximations on the so-called k-irregular meshes, they may generate spurious solutions on general irregular meshes. In this paper we present a mortar-type method to cure this problem in the two-dimensional case. More precisely, we introduce a projection based penalization at non-conforming interfaces and prove that the obtained DG methods are spectrally correct. The theoretical results are validated in a series of numerical experiments on both convex and non convex problem domains, and with both regular and discontinuous material coefficients.”

Numerical Analysis & Methods:

T. Warburton. An explicit construction of interpolation nodes on the simplex. Journal of engineering mathematics 56 (3), 247-262, 2006.

Abstract: “An open question concerns the spatial distribution of nodes that are suitable for high-order Lagrange interpolation on the triangle and tetrahedron. Several current methods used to produce nodal sets with small Lebesgue constants are recalled. A new approach is presented for building nodal distributions of arbitrary order, that is based on curvilinear finite-element techniques. Numerical results are shown which demonstrate that, despite the explicit nature of this construction, the resulting node sets are well suited for interpolation and competitive with existing sets for up to tenth-order polynomial interpolation. Matlab scripts which evaluate the node distributions on the equilateral triangle are included.”

T. Warburton and J.S. Hesthaven. On the constants in hp-finite element trace inverse inequalities. Computer methods in applied mechanics and engineering 192 (25), 2765-2773, 2003.

Abstract: “We derive inverse trace inequalities for hp-finite elements. Utilizing orthogonal polynomials, we show how to derive explicit expressions for the constants when considering triangular and tetrahedral elements. We also discuss how to generalize this technique to the general d-simplex.”

Mathematical result: Given a d-dimensional simplex $D$ then $\| u \|_{\partial D} \leq \sqrt{\frac{(N+1)(N+d)}{d} \frac{|\partial D|}{|D|}} \|u\|_D \; \forall \; u\in \mathbb{P}^N(D)$.

Radiation Boundary Conditions:

Thomas Hagstrom, Timothy Warburton, and Dan Givoli, Radiation Boundary Conditions for Time-dependent Waves Based on Complete Plane Wave Expansions, Journal of Computational and Applied Mathematics, Volume 234, Issue 6, Pages 1988-1995, July 2010.

Abstract: “ We develop complete plane wave expansions for time-dependent waves in a half-space and use them to construct arbitrary order local radiation boundary conditions for the scalar wave equation and equivalent first order systems. We demonstrate that, unlike other local methods, boundary conditions based on complete plane wave expansions provide nearly uniform accuracy over long time intervals. This is due to their explicit treatment of evanescent modes. Exploiting the close connection between the boundary condition formulations and discretized absorbing layers, corner compatibility conditions are constructed which allow the use of polygonal artificial boundaries. Theoretical arguments and simple numerical experiments are given to establish the accuracy and efficiency of the proposed methods”

T. Hagstrom and T. Warburton, Complete Radiation Boundary Conditions: minimizing the long time error growth of local methods, SIAM Journal on Numerical Analysis, Volume 47, Issue 5, pp. 3678-3704, 2009.

Abstract: “ We construct and analyze new local radiation boundary condition sequences for first-order, isotropic, hyperbolic systems. The new conditions are based on representations of solutions of the scalar wave equation in terms of modes which both propagate and decay. Employing radiation boundary conditions which are exact on discretizations of the complete wave expansions essentially eliminates the long time nonuniformities encountered when using the standard local methods (perfectly matched layers or Higdon sequences). Specifically, we prove that the cost in terms of degrees-of-freedom per boundary point scales with $\ln\frac{1}{\epsilon}\cdot\ln\frac{cT}{\delta}$, where $\epsilon$ is the error tolerance, T is the simulation time, and $\delta$ is the separation between the source-containing region and the artificial boundary. Choosing $\delta\sim\lambda$, where $\lambda$ is the wavelength, leads to the same estimate which has been obtained for optimal nonlocal approximations. Numerical experiments confirm that the efficiencies predicted by the theory are attained in practice.”

Thomas Hagstrom and Tim Warburton, A New Auxiliary Variable Formulation of High-Order Local Radiation Boundary Conditions: Corner Compatibility Conditions and Extensions to First Order Systems, Wave Motion, Vol. 39, 327-338, 2004.

Abstract: “ We present a new auxiliary variable formulation of high-order radiation boundary conditions for the numerical simulation of waves on unbounded domains. Retaining the flexibility of Higdon’s wave-product conditions, our approach allows arbitrary-order implementations. When applied to the scalar wave equation, the proposed method leads to balanced, symmetrizable systems of wave equations on the boundary. It can also be extended to first-order systems. Corner compatibility conditions are derived for the auxiliary variable equations. They are shown experimentally to lead to stable, accurate results.”

Warburton research funded in part by grants, contracts, and donations from:

ksjdlf