0. We have built up an extensive framework around dCache that mitigates problems and allows the system to work reliably. I am no longer interested in eliminating this 'scaffolding', since it is working fine. Further, the integrated packages we do get are often difficult for us to automate error handling for. A good example is the automated CRC pool scanning: it works fine, but we would have to write another script to process its errors. I'm not trying to complain here, just saying that we are satisfied with our framework. The rest of these items are mostly issues that we cannot script around and that need fixes.
1. Replica manager - often shows pools as offline when in fact they are online in the rest of the system and transferring data (dccp and gftp).
2. The Replica manager interactive ssh thread is often non-responsive.
3. Requests for files in pools that are offline correctly go to tape. When a pool comes back online, the files should be delivered to the user from the pool, and not from the tape request.
4. The Pin Manager needs to accept a file that contains the list of pnfsids to pin, so it can work through it at its own pace. Otherwise, doing this interactively one by one induces timeouts, and pin requests are ignored. The same is needed for unpinning.
5. SRM failures, especially in 3rd party transfers, are still a black box and difficult to debug.
6. If you do "mv /pnfs/dir1 /pnfs/dir2", the tags in the directory are lost and broken - they cannot be recreated.
7. Files on the pools have their access times changed on pool startup.
8. catalina.out is unmanageable.
9. When a file is pinned, the SRM only transfers the pinned copy - it ignores replicas in the pools.
10. When pinned files disappear (we don't know why), the Pin Manager tilts - it acts as if the pinned file were still there, causing all kinds of problems.
11. We spend lots of time fixing broken triads - triad = (data, control, SI).
12. Disconnects from clients during data transfers leave 0 length files in pools - the control file says 'from client'.
13. Error recovery for individual components is poor or completely missing - components block and no one detects it except by using special outside mechanisms.
14. Better exception handling - the ultimate scope should be component restart.
15. Automated recovery from failure.
16. It would be nice to have LRU deletions done centrally and not pool-by-pool.
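
To make item 4 concrete, here is a minimal Python sketch of the behaviour we are asking for: read the pnfsids from a file and work through them at a steady pace, retrying on timeouts and reporting what could not be done. It is not Pin Manager code. The names read_pnfsids and process_batch are made up for illustration, and the action argument stands in for whatever call actually pins or unpins a single file.

```python
import time
from typing import Callable, Iterable, List, Tuple


def read_pnfsids(path: str) -> List[str]:
    """Read one pnfsid per line, skipping blank lines and '#' comments."""
    ids = []
    with open(path) as handle:
        for line in handle:
            entry = line.strip()
            if entry and not entry.startswith("#"):
                ids.append(entry)
    return ids


def process_batch(
    pnfsids: Iterable[str],
    action: Callable[[str], None],  # hypothetical: pins or unpins one file
    delay_s: float = 1.0,
    max_attempts: int = 3,
) -> Tuple[List[str], List[str]]:
    """Apply `action` to each pnfsid in turn and return (done, failed)."""
    done: List[str] = []
    failed: List[str] = []
    for pnfsid in pnfsids:
        for attempt in range(1, max_attempts + 1):
            try:
                action(pnfsid)
                done.append(pnfsid)
                break
            except TimeoutError:
                if attempt == max_attempts:
                    # Give up on this file, but keep it in the report.
                    failed.append(pnfsid)
                else:
                    # Back off a little longer before each retry.
                    time.sleep(delay_s * attempt)
        # Pace the requests so the service is never flooded.
        time.sleep(delay_s)
    return done, failed
```

The point of the sketch is that no request is silently dropped: every pnfsid ends up in either the done list or the failed list, and the failed list can be fed back in as the next input file.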
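
For item 12, the kind of outside check we end up writing looks roughly like the Python sketch below. It only reports zero-length files in one directory and deletes nothing. The function name find_empty_files is made up, and the idea that a pool keeps its data files in a single flat directory is an assumption about the local layout, so the path has to be supplied by whoever runs it.

```python
import os
from typing import List


def find_empty_files(data_dir: str) -> List[str]:
    """Return the sorted names of regular files in data_dir with size zero."""
    empty = []
    with os.scandir(data_dir) as entries:
        for entry in entries:
            # Ignore subdirectories and symlinks; only plain files count.
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_size == 0:
                empty.append(entry.name)
    return sorted(empty)
```

A file that is still being written can be zero length for a moment, so the names returned here are candidates to cross-check against the control file, not a deletion list.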