Publications

Recent

  1. Indiana University (Fox, Qiu, Crandall, von Laszewski), Rutgers (Jha), Virginia Tech (Marathe), Kansas (Paden), Stony Brook (Wang), Arizona State (Beckstein), Utah (Cheatham), “Datanet: CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science” presentation at “Advancing the state-of-the-art in high-performance computing, communications and data analytics” workshop at Pittsburgh PA May 17-18 2016.

Rutgers University

  1. Hadoop on HPC: Integrating Hadoop and Pilot-based Dynamic Resource Management, Andre Luckow, Ioannis Paraskevakos, George Chantzialexiou, Shantenu Jha, Workshop on High-Performance Big Data Computing, 2016 http://arxiv.org/abs/1602.00345
  2. Pilot-Abstraction: A Valid Abstraction for Data-Intensive Application on HPC, Hadoop and Cloud Infrastructures?, Andre Luckow, Pradeep Mantha, Shantenu Jha pdf version here
  3. A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures, Shantenu Jha, Judy Qiu, Andre Luckow, Pradeep Mantha, Geoffrey C. Fox 2014 IEEE International Congress on Big Data (BigData Congress), 2014 pdf version here

Stony Brook University and Emory University

  1. Ablimit Aji and Fusheng Wang (2016). Challenges and Approaches in Spatial Big Data Management. Big Data: Storage, Sharing, and Security (3S) Fei Hu. Auerbach Publications.
  2. Cong Xie, Wen Zhong, Jun Kong, Wei Xu, Klaus Mueller, and Fusheng Wang (2016). IEVQ: An Iterative Example-based Visual Query for Pathology Database. Proceedings of the Second International Workshop on Data Management and Analytics for Medicine and Healthcare.
  3. Hoang Vo, Jun Kong, Dejun Teng, Yanhui Liang, Ablimit Aji, George Teodoro and Fusheng Wang (2016). A MapReduce Based High Performance Whole Slide Image Analysis Framework in the Cloud. Proceedings of the Second International Workshop on Data Management and Analytics for Medicine and Healthcare.
  4. Jun Kong, Pengyue Zhang, Yanhui Liang, George Teodoro, Daniel J. Brat and Fusheng Wang (2016). Robust Cell Segmentation for Histological Images of Glioblastoma. International Symposium on Biomedical Imaging (ISBI 2016).
  5. Xin Chen and Fusheng Wang (2016). Integrative Spatial Data Analytics for Public Health Studies of New York State. Proceedings of AMIA 2016 Annual Symposium.
  6. Yanhui Liang, Jun Kong, Yangyang Zhu and Fusheng Wang (2016). Three-Dimensional Data Analytics for Pathology Imaging. First International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH 2015).
  7. Liver Whole Slide Image Analysis for 3D Vessel Reconstruction. Y. Liang, F. Wang, D. Treanor, D. Magee, G. Teodoro, Y. Zhu, J. Kong International Symposium on Biomedical Imaging: From Nano to Macro (ISBI’2015), Accepted, Brooklyn, NY, USA, April 16-19, 2015.
  8. Automated Cell Recognition with 3D Fluorescence Microscopy Images. J. Kong, F. Wang, G. Teodoro, Y. Liang, Y. Zhu, C. Tucker-Burden, D.J. Brat International Symposium on Biomedical Imaging: From Nano to Macro (ISBI’2015), Accepted, Brooklyn, NY, USA, April 16-19, 2015.
  9. A Framework for 3D Vessel Analysis using Whole Slide Images of Liver Tissue Sections, Y. Liang, F. Wang, D. Treanor, D. Magee, N. Roberts, G. Teodoro, Y. Zhu, J. Kong International Journal of Computational Biology and Drug Design. In Press
  10. High Performance Spatial Queries for Spatial Big Data: from Medical Imaging to GIS, F. Wang, A. Aji and H. Vo In Press. ACM SIGSPATIAL Special Issue, 2015
  11. Effective Temporal Modeling for Scalable Spatio-Temporal Queries. H. Vo and F. Wang To Appear in Proc. of the International Workshop on Spatiotemporal Computing (IWSC’2015). Fairfax, Virginia, July 13-15, 2015
  12. High Performance Integrated Spatial Big Data Analytics, X. Chen, H. Vo, A. Aji and F. Wang In Proc. of the Third ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data (BigSpatial-2014), Nov 4, 2014, Dallas, TX, USA
  13. Haggis: Turbo Charge A MapReduce based Spatial Data Warehousing System with GPU Engine, Ablimit Aji, George Teodoro and Fusheng Wang In Proc. of the Third ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data (BigSpatial-2014), Nov 4, 2014, Dallas, TX, USA
  14. SATO: A Spatial Data Partitioning Framework for Scalable Query Processing, A. Aji, G. Teodoro and F. Wang In Proc. of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2014), November 4-7, 2014, Dallas, TX, USA

Indiana University

  1. Supun Kamburugamuve, Pulasthi Wickramasinghe, Saliya Ekanayake, Chathuri Wimalasena, Milinda Pathirage, Geoffrey Fox, “TSmap3D: Browser Visualization of High Dimensional Time Series Data”, Technical report May 10 2016 pdf version here
  2. Saliya Ekanayake, Supun Kamburugamuve and Geoffrey Fox, “SPIDAL: High Performance Data Analytics with Java and MPI on Large Multicore HPC Clusters”, Technical Report January 5 2016, Proceedings of 24th High Performance Computing Symposium (HPC 2016), April 3-6, 2016, Pasadena, CA, USA as part of the SCS Spring Simulation Multi-Conference (SpringSim’16) pdf version here
  3. Geoffrey Fox, Judy Qiu, Shantenu Jha, Saliya Ekanayake, and Supun Kamburugamuve, “Big Data, Simulations and HPC Convergence” Technical Report January 30 2016. DOI: 10.13140/RG.2.1.1858.8566 pdf version here
  4. Supun Kamburugamuve, Saliya Ekanayake, Milinda Pathirage, Geoffrey Fox, “Towards High Performance Processing of Streaming Data in Large Data Centers” Technical Report January 26 2016, to be published in proceedings of HPBDC 2016 IEEE International Workshop on High-Performance Big Data Computing in conjunction with The 30th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2016), Chicago Hyatt Regency, Chicago, Illinois USA, Friday, May 27th, 2016 pdf version here
  5. Bingjing Zhang, Bo Peng, Judy Qiu, “Model Data-Centric Computation Abstractions in Machine Learning Applications”, to appear in 3rd Workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR2016), held in conjunction with SIGMOD/PODS2016, July 1, 2016 pdf version here
  6. Bingjing Zhang, Bo Peng, Judy Qiu, “High Performance LDA through Collective Model Communication Optimization”, Proceedings of International Conference on Computational Science (ICCS2016) conference, June 6-8, 2016, San Diego, California. pdf version here
  7. Bingjing Zhang, “A Collective Communication Layer for the Software Stack of Big Data Analytics”, Doctoral Symposium. Proceedings of IEEE International Conference on Cloud Engineering (IC2E2016) Conference, April 4-8, 2016, Berlin, Germany. here
  8. Towards HPC-ABDS: An Initial High-Performance Big Data Stack, in Building Robust Big Data Ecosystem, Judy Qiu, Shantenu Jha, Andre Luckow and Geoffrey C. Fox, ISO/IEC JTC 1 Study Group on Big Data. March 18-21, 2014. San Diego Supercomputer Center, San Diego pdf version here
  9. High Performance High Functionality Big Data Software Stack In Big Data and Extreme-scale Computing (BDEC), Geoffrey Fox, Judy Qiu, and Shantenu Jha Fukuoka, Japan, 2014 pdf version here
  10. A Tale of Two Data-Intensive Approaches: Applications, Architectures and Infrastructure, Shantenu Jha, Judy Qiu, Andre Luckow, Pradeep Mantha, and Geoffrey C. Fox In 3rd International IEEE Congress on Big Data Application and Experience Track. June 27- July 2, 2014. Anchorage, Alaska pdf version here
  11. HPC-ABDS Kaleidoscope of over 300 Apache Big Data Stack and HPC Technologies Accessed April 8, 2014 Available here
  12. Towards an Understanding of Facets and Exemplars of Big Data Applications, Geoffrey C. Fox, Shantenu Jha, Judy Qiu, and Andre Luckow In 20 Years of Beowulf: Workshop to Honor Thomas Sterling’s 65th Birthday, October 14, 2014. Annapolis, MD pdf version here
  13. Big Data Use Cases and Requirements, Geoffrey Fox and Wo Chang In 1st Big Data Interoperability Framework Workshop: Building Robust Big Data Ecosystem ISO/IEC JTC 1 Study Group on Big Data March 18 - 21, 2014. San Diego Supercomputer Center, San Diego, CA pdf version here
  14. NIST Big Data Use Case & Requirements 2013 [accessed March 1, 2015] Available here
  15. Ogres: A Systematic Approach to Big Data Benchmarks, Geoffrey C. Fox, Shantenu Jha, Judy Qiu, and Andre Luckow In Big Data and Extreme-scale Computing (BDEC) January 29-30, 2015. Barcelona pdf version here
  16. Towards a Comprehensive Set of Big Data Benchmarks, Geoffrey C. Fox, Shantenu Jha, Judy Qiu, Saliya Ekanayake, and Andre Luckow February 15, 2015 pdf version here
  17. ESTIMATING BEDROCK AND SURFACE LAYER BOUNDARIES AND CONFIDENCE INTERVALS IN ICE SHEET RADAR IMAGERY USING MCMC, Stefan Lee, Jerome Mitchell, David J. Crandall, and Geoffrey C. Fox The International Conference on Image Processing (ICIP), Paris, France. October 27-29, 2014 pdf version here
  18. Harp: Collective Communication on Hadoop, Bingjing Zhang, Yang Ruan, and Judy Qiu In IEEE International Conference on Cloud Engineering (IC2E), March 9-12, 2015. Tempe, AZ pdf version here
  19. HPC-ABDS: High Performance Computing Enhanced Apache Big Data Stack. Geoffrey Fox, Judy Qiu, Shantenu Jha, Supun Kamburugamuve and Andre Luckow Invited talk at 2nd International Workshop on Scalable Computing For Real-Time Big Data Applications (SCRAMBL’15) at CCGrid2015, the 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, held in Shenzhen, Guangdong, China pdf version here
  20. Parallel Clustering of High-Dimensional Social Media Data Streams. Xiaoming Gao, Emilio Ferrara, Judy Qiu Presented at CCGrid2015, the 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, held in Shenzhen, Guangdong, China pdf version here

University of Kansas

  1. John Paden, Theresa Stumpf, Wideband DOA Estimation for Ice Sheet Bed Mapping, Paper#: 101, 2016 IEEE International Symposium on Phased Array Systems & Technology, 18-21 October 2016 Massachusetts, USA.
  2. Radiostratigraphy and age structure of the Greenland Ice Sheet. J.A. MacGregor, M.A. Fahnestock, G.A. Catania, J.D. Paden, S. Gogineni, S.C. Rybarski, S.K. Young, A.N. Mabrey, B.M. Wagman and M. Morlighem Journal of Geophysical Research Earth Surface, Jan 2015, 2014JF003215. pdf version here
  3. Bed Topography of Jakobshavn and Byrd Glaciers, S. Gogineni, J.-B. Yan, J. Paden, C. Leuschen, J. Li, F. Rodriguez-Morales, D. Braaten, K. Purdon, Z. Wang, W. Liu, and J. Gauch Journal of Glaciology, vol. 60, no. 223, pp. 813-833, Nov 2014. pdf version here
  4. Radar Mapping of Isunguata Sermia Glacier, Greenland, Ken Jezek, Xiaoqing Wu, John Paden, Carl Leuschen IGS Journal of Glaciology, vol. 59, no. 218, pp. 1135-1146, 2013.
  5. Layer-finding in Radar Echograms using Probabilistic Graphical Models, David J. Crandall, Geoffrey C. Fox, John D. Paden 2012 International Conference on Pattern Recognition. pdf version here
  6. High-Altitude Radar Measurements of Ice Thickness over the Antarctic and Greenland Ice Sheets as a part of Operation Ice Bridge, Jilu Li, John Paden, Carl Leuschen, Fernando Rodriguez-Morales, Richard Hale, Emily Arnold, Reid Crowe, Daniel Gomez-Garcia, and Sivaprasad Gogineni, IEEE Transactions on Geoscience and Remote Sensing, 2012, vol.50, no. 12, doi: 10.1109/TGRS.2012.2203822. pdf version here
  7. Ice-sheet bed 3-D tomography, John Paden, Torry Akins, David Dunson, Christopher Allen, Sivaprasad Gogineni Journal of Glaciology, 56 (195), 3-11. pdf version here
  8. Comparison of tomographic methods for ice bottom mapping, John D. Paden, Sahana Raghunandan, Shannon Blunt, Carl Leuschen International Glaciological Society Radioglaciology 2013 Meeting, Sept 9-13, entry 67A084. Information available here
  9. 3D Imaging of Ice Sheets, John Paden, Christopher Allen, Prasad Gogineni IEEE Geoscience and Remote Sensing Symposium, 2010 (IGARSS ’10), Honolulu, Hawaii, 25-30 July, 2010. pdf version here

Arizona State University

  1. MDAnalysis: A toolkit for the analysis of molecular dynamics simulations. N. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein. J Comp Chem, 32:2319–2327, 2011. doi: 10.1002/jcc.21787. More information here

Virginia Polytechnic Institute and State University

  1. Alam M, Khan M, Vullikanti A, Marathe M (2016) An Efficient and Scalable Algorithmic Method for Generating Large-Scale Random Graphs. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Salt Lake City, UT, November 13-18, 2016
  2. Arifuzzaman S, Khan M, Marathe M (2015) A Space-efficient Parallel Algorithm for Counting Exact Triangles in Massive Networks. In Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications. New York City, NY, August 24-26
  3. Parallel Algorithms for Generating Random Networks with Given Degree Sequences. Maksudul Alam and Maleq Khan. 12th IFIP International Conference on Network and Parallel Computing (NPC), New York City, Sep. 2015. pdf version here.
  4. A Space-efficient Parallel Algorithm for Counting Exact Triangles in Massive Networks. Shaikh Arifuzzaman, Maleq Khan and Madhav Marathe. 17th IEEE International Conference on High Performance Computing and Communications (HPCC), New York City, Aug. 2015. pdf version here.
  5. Fast Parallel Conversion of Edge List to Adjacency List for Large-Scale Graphs. Shaikh Arifuzzaman and Maleq Khan. 23rd High Performance Computing Symposium (HPC), Alexandria, VA, USA, April 2015. pdf version here.
  6. Distributed Memory Parallel Algorithms for Massive Graphs. Maksudul Alam, Shaikh Arifuzzaman, Hasanuzzaman Bhuiyan, Maleq Khan, V.S. Anil Kumar, and Madhav Marathe. Parallel Graph Algorithms, CRC Press / Taylor & Francis, 2015 Ed. David Bader.