We have a lot of data (PB+) spread across multiple sources (HPSS, GPFS, etc.), and we are looking for a tool to help our users manage their data better. We are looking at a software solution called Starfish, but I'm curious to see what other sites have done to solve this problem.
One example of a question the tool should be able to answer: "Who in my project is in the top 10% of producers and/or consumers of data?"
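To make that kind of query concrete, here is a rough Python sketch of the aggregation I have in mind. It is not how Starfish or any other product works. It assumes we already have (user, size in bytes) records from some metadata scan of HPSS or GPFS, and the function name `top_producers` and its `percent` parameter are made up for illustration.

```python
from collections import defaultdict


def top_producers(records, percent=10):
    """Return (user, total_bytes) pairs for the top `percent` of users by volume.

    `records` is any iterable of (user, size_in_bytes) pairs. Where they come
    from (a file-system scan, a metadata dump, ...) is an assumption here.
    """
    totals = defaultdict(int)
    for user, size in records:
        totals[user] += size

    # Largest totals first; ties are broken by user name so output is stable.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    if not ranked:
        return []

    # Integer ceiling division, so a small project still reports one user.
    keep = max(1, -(-len(ranked) * percent // 100))
    return ranked[:keep]
```

For example, `top_producers([("alice", 100), ("bob", 50), ("alice", 200), ("carol", 10)])` returns `[("alice", 300)]`. The hard part at PB scale is producing the records in the first place, which is what I am hoping a tool can do for us.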
Hello there! To all of you who are planning to attend SC18 in Dallas: the company I work for is holding a free Lunch & Learn session hosted by Univa and sponsored by Red Hat, featuring specialists from AWS and Intel, who will discuss several topics, including containerization via Docker and Singularity. Everyone is welcome! Wed, Nov. 14, at 12 p.m. Details & RSVP at https://www.univa.com/resources/sc18-lunch-learn.php