Log April 16
We will try to rerun the jobs that failed last weekend. This will be something like 100 jobs running through the weekend. Tony will inform people when data should start to show up. We will give back to IT about 100 CPUs today as we do not anticipate requiring them for DC04 anymore. Tony believes we can get about 10M more events through the system by the end of April using 200-300 CPUs. This data is at CERN and can hopefully be built into runnable datasets in the next few days.
(This would mean we eventually get about 15M events fully through the system in the DC04 period, about 1/3 of what we originally planned, but not too bad an achievement)
We will also return some of our disk buffers. For starters we can return one from the gdb. We ask the SRM and SRB EB teams to propose a route for us to liberate 1/4 disk servers from each. Clearly this will require some thought. The ball is in the EB managers' court to tell us when this can be done.
CNAF reported a Castor stager problem that they are recovering from, but it may impact the start-up of transfers to CNAF.
Nicola reported that some 200 analysis jobs ran last night. We look forward to a report on them. When the CNAF Castor problems are solved we would like to see a timestamp analysis showing the realtime analysis performance. Tony points out that a key step in optimizing the analysis turnaround will be to optimize the export buffer file selection algorithms: to move away from random selection to algorithms that ensure that datasets (or the filesets required by an analysis job) are completed more quickly.
Lassi reported that a new COBRA/ORCA will be available (Stephan notes it is available now). There are no ORCA changes, but there are some COBRA improvements to allow location-independent virgin metadata catalogs.
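As an illustration of the kind of selection Tony describes, here is a minimal sketch that favours the filesets closest to completion instead of picking files at random. The function name, the tuple layout and the fileset identifiers are all hypothetical; the actual export buffer agents are not shown in this log.

```python
from collections import defaultdict


def select_files(pending, n):
    """Pick up to n files, favouring filesets closest to completion.

    pending: list of (file_name, fileset_id) tuples still waiting for export
    (a hypothetical representation of the export buffer queue).
    """
    remaining = defaultdict(list)
    for name, fileset in pending:
        remaining[fileset].append(name)
    # Filesets with the fewest files still to transfer come first, so that
    # each one is completed before work starts on the next.
    ordered = sorted(remaining.items(), key=lambda kv: (len(kv[1]), kv[0]))
    chosen = []
    for _, names in ordered:
        for name in sorted(names):
            if len(chosen) == n:
                return chosen
            chosen.append(name)
    return chosen
```

With one file left in fileset B and three left in fileset A, a request for two files returns the B file first and then starts on A, so an analysis job waiting on B can begin as early as possible.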
Log April 13
As reported in mails today, T0 running over the Easter period went well on the whole. The transfers via the EB-SE and EB-SRM also went well. The situation with the SRB sites is not clear at the moment.
Agents everywhere functioned rather well except when the machines they were running on got rebooted.
The T0 production is more or less stopped at the moment. They will recover the jobs that failed and get them through the system.
The emphasis for the next "2" weeks will be on demonstrating and quantifying the realtime analysis turnaround. To this end we will plan to run modest scale productions to keep the system running and feed the realtime analysis tests, at the same time completing the reconstruction of the available digitized datasets. A major goal of this is to also improve the processes and tools by which the compilation of 'datasets' at each stage is managed.
The medium term (3 weeks) goal is to put in place better systems with which we can run semi-continuous productions at a number of T1/T2 sites to get the reconstructed data into the physicists' hands as quickly as possible.
Log April 1


The EB-SE agents and T0 production seem to be suffering from some castor-related problem. Files are being requested from the gdb that are no longer "on disk" and suffer some castor staging errors (tape unavailable), but it is not at all obvious why any files would not still be on the gdb. This is being investigated.
The transfer agents were otherwise running well and the SE ones were keeping up. Ian proposes to change the number of parallel transfers on the SRM-EB to match the SE performance more closely. The SRB transfers have a larger backlog to catch up on. RAL was keeping up with new files, but a bug was found such that Lyon and FZK were trying to get their copies out of the RAL tape system (behind a firewall!) instead of from the gdb. This has now been fixed.
We do not currently understand why the rate of jobs finishing fluctuates so wildly. Werner noted that the batch nodes are in general all behaving very well: 90-95% CPU utilization, no swapping or paging except at job startup. See the attached CPU and network plot.
Lassi, Pablo et al. plan some interventions on the TMDB this evening, taking advantage of production draining due to the possible castor problems to make their changes now.
We agreed that all DC04 activities will stop on the evening of April 8 and only be restarted on April 13 allowing everyone a genuine Easter break.
We should soon plan the special condition tests that can stress individual parts of the system. Proposals should be circulated before the Friday DC04 meeting (16h). Examples might be: also streaming digi files to stress the WAN; throttling the gdb publish step so that it can then be increased to simulate higher rates of jobs finishing; ...
Log Day 30
After some hiccups overnight, production is back up with 500 jobs running. The muons are now being reconstructed!
Overnight, when the config agent came back up, there were 110k files to work through. As a result of the way the agent ran, nothing happened for a very long time... Owen has modified it to work on just the first 300 files if it finds more than 300, rather than try to manage such an enormous file list.
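The capping approach can be sketched as follows. This is not Owen's code: the helper names are hypothetical, and only the limit of 300 is taken from the log. The point is that each pass touches a bounded slice of the backlog, so progress becomes visible quickly even with 110k files queued.

```python
MAX_FILES_PER_PASS = 300  # limit quoted in the log


def next_batch(file_list, limit=MAX_FILES_PER_PASS):
    """Return the files to handle in this pass and the backlog left for later."""
    return file_list[:limit], file_list[limit:]


def drain(file_list, handle, limit=MAX_FILES_PER_PASS):
    """Work through a backlog in bounded passes; return the number of passes."""
    passes = 0
    while file_list:
        batch, file_list = next_batch(file_list, limit)
        for name in batch:
            handle(name)  # hypothetical per-file action of the agent
        passes += 1
    return passes
```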
lxgate04 is running very slowly. The processes are doing nothing, but there is enormous swapping, presumed to be an internal kernel problem. It will be rebooted.
CNAF reports successfully running analysis via the UI/RB on the CNAF CE accessing the DST files from the CNAF SE. Congratulations!
To do this they followed the production methods to get the metadata from RefDB rather than through the RLS. We look forward to seeing this happen through the RLS.
We identified two "analysis" use cases:
1) The ability to analyze files (or groups of files that need to be present) on the fly at the T1 as they appear.
(Stephan has circulated a prescription to do this with the use of the initial virgin meta-data files registered in the RLS/TMDB and transferred once and for all to the T1, followed by use of the findcolls command to identify the files that need to be read by the job.)
2) The ability to give users larger chunks of data (PARTIAL or FULL datasets) for which all such metadata operations have been completed at the T1, so that they can process the collection without further metadata manipulations.
Our main (new) goal now is to demonstrate the satisfaction of these two use cases. (In addition to demonstrating the ability to keep the T0 and T1 traffic up at the required rate.)
Log Day 26
Production ran last night. A few hundred jobs ran with 85% success.
The failures were probably due to a mis-configured RFIO, not to ORCA per se. HLT code is still leaking a bit and is still running even when Stephan tries to turn it off, however the leak is not so bad and the memory footprint seems to be well under 500M. So we will run with HLT turned on and store its output.
So this looks like a useable ORCA version (Congratulations ORCA team!)
and we can proceed to full DC04 production this weekend. The goal will be to keep the 500 nodes busy all weekend if we can and to have all agents running for export to the Tier-1s.
Staging capacity on the detector pool is increased to 6TB. Thanks IT! Prestaging software is in place so we should avoid castor waits.
Last night one SRB server stopped after losing its partition table. A bizarre, unheard-of error. Not understood.
The SRM head node has heavy load. Ian is investigating.
Lassi circulated a list of priority items for RLS. He requests feedback by Monday so we can specify our priority list to the POOL and RLS teams. We especially need to address the likely use cases of LCG based analysis where it may load the RLS in new ways.
Lassi has also been publishing to the cert01 rls. This may or may not be the one that is supposed to be synched to CNAF?! Please, CNAF and Lassi, get together to make sure we are exercising the system so we can get it cleared for production use as soon as possible.
Log Day 16
Day 16 Log. March 16
Stephan reported that ORCA_8_0_0 is "out". All PRS groups have useful objects in the DST. The EGamma calibration and Muon DT calibration streams are working!
The HLT code is in but has some problems, so it is datacarded out of operation at present. The POOL 161 release is expected imminently to address some DST reading problems. Stephan needs to provide Julia with a new orcarc file.
Julia is waiting for the cards and will then run ORCA_8_0_0 overnight.
The RLS publisher agent was broken and was not registering the POOL metadata. A script to correct that is ready (18k files), and the agent has been fixed.
Still to do: COBRA meta files zipped up and sent.
The timing of RLS insertions is on average 6 seconds, with a few events at 90s. There was a series of step improvements that seemed to be associated with the change of Application Server on Sunday morning that are not at all understood.
The failures of the RLS to register some files over the weekend are not understood.
Of RLS updates, 6% failed. After resubmission, 1.3% still required by-hand interventions.
Some edg experts have recommended that we do bulk updates. However, after the meeting it became apparent that the present LCG middleware does not support this option, nor would it register the metadata.
The cleanup tools in the export buffers are ready to run. Once that is in, the cleanup tools in the gdb will be implemented.
Julia will look into sending the assignment number with the virgin metafiles or some php script to get it from refdb. This is needed by the file attachment procedure that will need to run at the T1 and is a bit painful to get at present.
We believe the ECAL calibration streams should stay at CERN - are there other centers who would like them? We assume the Muon DT stream will be sent to CNAF; again, any other takers?
Next meeting will be at 15:00-18:00 Wednesday and will be a more formal* meeting with written status reports to bring the wider community up to date and to give David a source of slides for Thursday morning.
* black tie, evening dress for the ladies
Log Day 15
Day15 Resource Usage Plot
March 15, 2004
The DC04 Daily Meeting overcame many obstacles this afternoon, but successfully happened after a fashion. There was a confusion about the actual time of the meeting and VRVS location of the meeting. The old schedule indicated the meeting was at 3PM, though the minutes from Friday said 4PM. For some reason there was a CCS DC04 meeting and a CMS DC04 meeting booked in VRVS. Even for those who connected in the correct room, the VRVS setup at CERN was not working well. We were unable to hear anyone and the update rate on the chat window seemed slower than usual. Apologies to all remote participants and everyone confused about the time.
Stephan's normal motion to cancel the challenge was proposed more quietly than usual today. This indicates a more optimistic outlook from Stephan or resignation.
Reconstruction:
Reconstruction ran smoothly over the weekend with few failures. The estimated data rate was 4Hz.
The new version of the reconstruction code will hopefully be released in the next 24 hours. It will either be identified as ORCA_7_8_0 or ORCA_8_0_0. The developers have been working through a memory leak. The HLT code is not in a sufficiently good state to run in the challenge. Useful calibration streams are being identified. The current candidate streams are ECAL and Muon Drift Tubes.
Transfers:
There was a concern expressed about the size of the files being replicated in DC04. The DSTs and Streams are not large and this leads to small files because runs from several DSTs cannot currently be merged. Stephan suggested merging the HLT and CaloRecHit information into the DST files. This will not significantly increase the size of the DST files, but it will reduce the number of very small files by 50%.
There were problems with the RLS over the weekend. The exact cause of the problem is unknown. Queries and updates to the RLS were not always answered during the period the RLS was having problems.
Reports from the Tier-1 centers were generally good
CNAF had a productive weekend and has started successfully duplicating data to Legnaro. The Castor filename-length problem continues to be an issue.
IN2P3 recovered from some hardware failures over the weekend to transfer the complete file set to tape.
PIC successfully transferred files. During the transfers the administrators at PIC uncovered a configuration error in their Castor installation. The migration to tape is faster since the fix. The PIC developers switched to the C++ API to the RLS, which reduces the time to publish by nearly a factor of 10 (4-5 seconds goes to 0.5s). PIC is investigating the C++ interface for more operations. At the moment they are CPU limited by the Java processes of the other interface.
FNAL recovered from some local network problems which caused dCache pools to lose contact. After restarting dCache, the agents were able to catch up with the transfers.
There were no representatives from GridKa or RAL, but according to the TMDB the weekend was successful. GridKa sent a mail indicating that without an automatic interface to mass storage they have no way of bringing data on and off tape for remote users transparently. They are currently using SRB for transfer, which has not been integrated into their mass storage system.
Analysis:
Tony and Julia provided a prescription to FNAL for attaching runs from arriving data. It requires the Tier-1 center know the assignment ID from the incoming data, which involves using the names to look in the RefDB page and get the assignment number. This is recognized as an issue, but there are currently no solutions proposed.
Action Items for the next 24 hours:
Stephan will provide the new reconstruction code and the new data cards to merge the small files into the DST.
Lassi will modify the configuration agent to allocate digi samples to some Tier-1 centers. This will allow the transfer agents to measure transfer rates with the larger files and verify the configuration agent is working to divide data between centers.
An RLS expert will hopefully circulate a theory to explain the problems the RLS had over the weekend.
Lassi will bundle the META data files from CERN to allow local analysis with prepared META data files
Fermilab will try the attach run prescription for generating META data
Responsible people should verify that their transfer components are being monitored by at least one system. It is preferable to have redundant monitoring, but everyone should be monitored by something.
Tomorrow's Meeting will take place at 5PM MET.
Log Day 12
Day12 Resource Usage Plot
Stephan reported some problems in reading the new DST that have been reported to POOL/ROOT. A workaround will be in place for Monday. Otherwise the DST is much improved over the 7_7_0 version in usefulness for the PRS. Congratulations to the ORCA coders who have been working very hard! HLT objects will be available and it looks like the Electron calibration stream will be ready for the Monday release.
Production and T0/T1 teams will aim at keeping the full system running over the weekend. Goal is long term steady functionality tests rather than pushing the scale too far.
Dirk reported on an almost stable set of monitoring information for the RLS DB and Application Servers. It will be made available via http for interfacing to ML.
Ian reported they are ready to have GridICE running on the server nodes at FNAL but are awaiting the GridICE server to be back up. Sergio has been contacted.
Production will be dispatching the metadata files for the "validation samples". FNAL will work to make the samples useable this afternoon (FNAL time).
DC04 meetings will take place all next week (cmsweek)
Monday 16:00
Tuesday 17:00
Wednesday 15:00
Thursday 15:00
Friday 15:00
Note that Wednesday's meeting will be a formal summary with presentations etc.
Log Day 11
Day11 Resource Usage Plot
DC04 Daily Log Day 11
March 11, 2004
Stephan reported very busy people working on the ORCA release. Hope for a stable releasable version Friday.
Production has been assembling more datasets. 5M published now, a few M more coming online.
All T1s managed to transfer data using the production tmdb. Lassi's agents running continuously, config agent running continuously. T1
agents up and running. AFS token problem cured with acrontab. Publishing failed this morning to Bristol and is being investigated. INFN were
the first to successfully transfer production data.
Tony suggested to Lassi after the meeting that he try taking one or more of the published catalogues and writing something that just injects Digis
from it into the T0-dropbox every few minutes, so there's a constant heartbeat of traffic.
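A heartbeat injector along these lines could look like the sketch below. Everything in it is an assumption made for illustration: the drop box is treated as a plain directory, entries are (name, payload) pairs taken from a catalogue, and the five-minute interval stands in for "every few minutes". The write-then-rename step is there so that a reader of the drop box never sees a half-written file.

```python
import itertools
import os
import tempfile
import time


def inject(dropbox, name, payload):
    """Write one entry into the drop box atomically (temp file, then rename)."""
    fd, tmp = tempfile.mkstemp(dir=dropbox, prefix=".tmp-")
    with os.fdopen(fd, "w") as fh:
        fh.write(payload)
    os.replace(tmp, os.path.join(dropbox, name))


def heartbeat(entries, dropbox, interval_s=300, beats=None, sleep=time.sleep):
    """Inject one catalogue entry per interval, cycling through a non-empty list.

    beats=None runs forever; a number stops after that many injections.
    """
    source = itertools.cycle(entries)
    count = 0
    while beats is None or count < beats:
        name, payload = next(source)
        # Prefix a counter so repeated entries do not overwrite each other.
        inject(dropbox, f"{count:06d}.{name}", payload)
        count += 1
        sleep(interval_s)
    return count
```

Passing `sleep` as a parameter keeps the loop testable without waiting for real time to elapse.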
Aim for distribution to be stable this weekend for continual exercising. Production will aim to keep a steady stream of files coming
into the gdb and drop box for distribution
SE export buffer now available. Thanks to IT/FIO and EIS groups. The machine needs to go onto HTAR as we are saturating the firewall. Once
stable operation is demonstrated we will ask for the upgrade from 1 to 4TB.
RAL may need a kernel upgrade on some SRB machines tomorrow.
(The problem of asynchronous and continuous security upgrades is proving to be a serious one that will have to be addressed in
establishing a stable computing model)
PIC reported 4-5 MB/s limited by CPU in the (local)UI. They will upgrade the hardware tomorrow to try to improve the situation
FNAL: the MySQL POOL catalog was populated OK, but the problem was that no metadata came with the validation samples. Not yet ready to run the
attachrun procedure at each center. (For a large data set this can take hours) Production will distribute the metadata files, but they
cannot go through the dc04 path as they cannot be registered in the RLS as it does not allow file versioning. (This is a significant missing
feature for the future...) Tony will consult with Julia and send a mail later this evening about the initial use of metadata. For the real-time
analysis the production team will work on making the attachrun tools robust with a timescale for later next week.
Getting the data sample useable at at least one T1 is on the critical path now.
Validation samples will go back into the production tmdb this evening for redistribution
Note: tomorrow's meeting has been moved to 3PM MET
Log Day 10
Day10 Resource Usage Plot
DC04 Day 10 log March 10, 2004
Reconstruction:
Currently reconstruction is not running except for test samples.
A test sample with the writeStreams executable was run overnight. Everything
worked smoothly, and RefDB was correctly updated with the information sent by
the writeStreams jobs.
Tony is working on publishing the samples that were produced and transferred to
CERN. Nikolay is making information about published data available on the
production page. Some work is required on the publishing scripts to make
them work for reconstruction samples; Julia is working on it.
The infrastructure for putting items in the GDB agent drop box is
automatic from reconstruction jobs.
Transfers:
Tim released a production version of the TMDB and the old version is
now available for testing and development. The web page has been
updated to allow the browsing of both.
Lassi's agent to populate the RLS is running automatically. During
Tim's retry of the validation samples some sets of files were not
successfully entered with FCpublish. The working theory is that since
the XML entries from the reconstruction jobs are fragments and the XML
entries from the validation samples are complete, the agent may not be
able to handle both formats equally well. Lassi and Tim will
investigate.
The configuration agent which assigns datasets to regional centers is
running. Currently there are no streams so the agent defaults to
sending all data to all centers.
Tier1 Transfer Status:
Sites relying on SRB were stopped due to the down time of the MCAT.
The SRB export buffer agent was able to successfully test importing
files into the buffer before the catalog was lost. IN2P3's agent is
ready for production use. GridKa believes that their agent is OK,
but will perform validation testing this evening. RAL will be able
to test now that the catalog is back.
Sites relying on the classical storage element, PIC and INFN, are
experiencing a strange issue with files being declared 'CLEAN' even
though they are still in the buffer. Investigation is needed.
FNAL, which uses the SRM export buffer, enabled the clean up agent.
This agent checks that files are marked as safe and removes them from
the export buffer. After some help from the e-mail list Yujun was
able to populate the Pool mySQL catalog with the meta data. He is
adjusting the PFN for the FNAL site and expects a catalog to test
against by the end of the Fermilab day.
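The behaviour of the clean-up agent described above (check that a file is marked as safe, then remove it from the export buffer) can be sketched as follows. The `states` mapping is a hypothetical stand-in for whatever the agent reads from the TMDB, and the state name "SAFE" is an assumption; the log does not give the actual values.

```python
import os


def clean_buffer(buffer_dir, states, safe_state="SAFE"):
    """Remove files whose recorded state is safe; return the names removed.

    states: mapping of file name -> transfer state (assumed layout).
    Files that are not marked safe, or that are already gone, are left alone.
    """
    removed = []
    for name, state in sorted(states.items()):
        path = os.path.join(buffer_dir, name)
        if state == safe_state and os.path.exists(path):
            os.remove(path)
            removed.append(name)
    return removed
```

Returning the list of removed names gives a simple record to compare against the database, which is the kind of check that would expose files declared clean while still sitting in the buffer.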
Analysis:
Stephan Wynhoff asked for a delay of the release of ORCA_7_7_8 to
Wednesday or Thursday of next week. The lack of available data to
test against, at CERN and elsewhere, is delaying the development. The
data is expected at CERN this evening.
Monitoring:
MonaLisa monitoring is becoming more complete. Most of the export
buffer systems are now reporting.
Ian attempted to install GridICE on the SRM export buffer systems.
The repository of the software at INFN is currently offline for an
intervention. Sergio will post to the list when it is available
again.
Log Day 9
Day9 Resource Usage Plot
DC04 Day 9 log March 9, 2004
DST reading software has been improving. It is now possible to read the DST without loading Digis, Hits, MC. Well done ORCA crew! The 7_8_x release is scheduled for Monday 15, but Stephan is aiming to have it stable some days before that. Stephan has distributed a tag with a fakeAnalysis example that T1s should be able to use to start to exercise their local installations of data and catalogs.
The production team is ready to run the writeStreams executable in production now and will do that probably today.
The data transfer teams have been making steady progress. Full chain transfers have been run to all centers now, though a number of (soluble) problems are being worked on.
Tim is setting up a production TMDB system tonight to mount a "stable" service and allow development to continue on a separate TMDB. (setting purge flags for the files in the gdb will become an issue sometime, but for the moment the system is only 10% full)
The registration of files in the RLS was a bottleneck with the POOL tools, but is apparently much faster with the standard edg tools. However querying the RLS for metadata still looks to be a serious performance problem for even modest numbers of files. Alessandra et al will investigate and come with their recommendations in case we are not using the RLS in the foreseen way. If the RLS is queried for the POOL meta-data for each file to build local catalogs at the T1, even 5s/file can become impossible. The time is associated to the start/stop of the java vm, not the query as such. If this is a correct statement of the problem, then we can envisage other ways to access the RLS (direct SQL queries) or distribution of the POOL meta-data with the files.
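To see why 5 s/file matters, and why the start-up cost is the thing to attack, here is a small cost model. The split of the 5 seconds into 4.9 s of start/stop and 0.1 s of query is an invented illustration, not a measurement, and the batching parameter simply models any access method that pays the start-up cost once per many files. The 6000-file count is the size of the PRS validation sample quoted in the Day 8 log.

```python
def catalog_build_time(n_files, startup_s, query_s, files_per_call=1):
    """Estimated wall time in seconds to fetch metadata for n_files.

    startup_s: fixed cost per client invocation (e.g. JVM start/stop).
    query_s:   cost of the query itself, per file.
    """
    n_calls = -(-n_files // files_per_call)  # ceiling division
    return n_calls * startup_s + n_files * query_s


# Illustrative numbers only (see the assumptions stated above).
per_file = catalog_build_time(6000, startup_s=4.9, query_s=0.1)
batched = catalog_build_time(6000, startup_s=4.9, query_s=0.1, files_per_call=100)
print(f"one call per file:  {per_file / 3600:.1f} h")
print(f"100 files per call: {batched / 60:.1f} min")
```

Under these assumptions a serial, one-call-per-file catalog build takes on the order of eight hours, while amortizing the start-up cost brings it down to about a quarter of an hour.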
The CASTOR at CNAF seems to have a shorter maximum filename length than at PIC or CERN for unknown reasons. This is preventing them from storing files with their correct full names. CNAF/PIC CASTOR experts are working on this. (PIC does have the chain working fully through to MSS.)
The CERN CASTOR system will go down for three hours from 11MET Wednesday. All gridftp etc activities will fail and should be halted beforehand.
The SRB system went down at RAL due to "external technical reasons"; it will be back tomorrow at about the same time as CASTOR goes down...
Resource Monitoring and Utilization Plot for Day 9:
Log Day 8
Day8 Resource Usage Plot
Day8 Network Plot
Log of DC04 Day 8
March 8, 2004
Reconstruction:
Lots of reconstruction jobs ran over the weekend. We are now approaching half a million reconstructed events and the ramp is good.
Transfer Management and Data Flow:
The discussion of information flow between reconstruction processes and Tier-0 agents today condensed to the following (from Lassi's summary mail):
Reconstruction jobs will place XML catalog fragment and checksum
file in a designated inbox on lxgate04. The publishing agents
will add POOL XML header/footer to the fragment to a POOL
catalog fragment. The catalog fragment contains fake PFN
identical with LFN. The publishing agents replace the PFN
with real castor path with "stageqry" before registering the
information into RLS or TMDB. TMDB contains the
full castor path without the "rfio:" prefix. Any other information
needs to be looked up in RLS. The checksum file contains the
LFN, a bare file name without any directory component
(no leading "./"), in "cksum" output format. The LFN in the catalog fragment is identical, no directory component.
The files are placed in a new (unique) inbox subdirectory:
Some.directory.name/
XMLCatFragment.Some.name.xml
CheckSum.Some.name.xml
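A reader of the checksum file could apply the rules above roughly as sketched here. The standard `cksum` utility prints the CRC, the size in bytes and the file name on one line; the function names and the example file name are hypothetical.

```python
def parse_cksum_line(line):
    """Parse one line of `cksum` output: '<crc> <size> <name>'."""
    crc, size, name = line.strip().split(None, 2)
    return int(crc), int(size), name


def check_lfn(name):
    """Enforce the rule that the LFN is a bare file name (no directory part)."""
    if "/" in name:
        raise ValueError(f"LFN must not contain a directory component: {name!r}")
    return name


# Example with a made-up file name.
crc, size, lfn = parse_cksum_line("4038471504 75 EVD0_Events.root\n")
check_lfn(lfn)
```

Rejecting any name containing a slash also catches the leading "./" case that the summary explicitly rules out.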
RLS and TMDB publishers make this information available in
relevant databases unchanged. That is, the data content (file name
strings, checksum, guids, metadata attached to files in the catalog
fragment) is not modified or filtered in any way. Only the outer
"wrapper" is modified so it is acceptable to the tools we use.
On Saturday functionality validation samples were replicated to Tier-1 centers by CNAF and FNAL. 293 files were placed in the TMDB by Tim; the files were replicated to Export Buffers and then to Tier-1 centers. The RLS catalogs were updated at each step. Yujun Wu at FNAL noticed the RLS replica update using the POOL tools takes approximately 5 seconds per event. This is of the same order as the transfer time for the files and represents a significant overhead. Update times of 7-8 seconds were observed at PIC. Alessandra recommended switching to the EDG tools, which reduces two steps to one.
edg-lrc --no-validation addMapping --endpoint $CERN_LRC --vo=cms $GUID $PFN
FNAL will evaluate the performance change by switching to the EDG tool. FNAL will also move the RLS update to a separate script, which runs asynchronously to avoid slowing the transfers with the RLS update.
6000 files from the PRS validation samples are expected in the TMDB by the end of the day at CERN.
Analysis:
For the analysis steps, reconstruction will run the DSTs for the validation events being shipped to Tier-1 centers. The DST can then be shipped to Tier-1s for fake analysis. The current version of ORCA requires Digi files to be available. The next version of ORCA, which will be available on March 15th, is better suited for analysis, but the current version can be used for testing. It is hoped that by Wednesday the DSTs from the validation samples will be transferred to Tier-1 centers. Stephan has made an ExDSTStatistics example available as a first DST program only for technical testing.
Until the new version of ORCA is available the digi files are required. In order to ship the digi files with the DST files, the reconstruction jobs must include them in the XML fragment given to the Tier-0 agent.
Tomorrow's meeting is at 4PM MET.
Log Day 5
DC04 Day5
March 5, 2004
No production run today, but jobs will go in again tonight to populate the gdb with new event files and to dispatch the catalog-fragment and cksum files to the dropbox on lxgate. This should give a realistic looking set of data for the T1 replicators to work on. Multi-Stream outputs are being tested in production and will be ready soon. Once the replicators have got used to the standard DST we will run some jobs with the streams turned on to give them more work to do.
The changes to the oracle rls (rlscms) have been made, so that the string length problems have been solved for the time being. POOL_1_6_0 is in final stages of pre-release, we can hope for it today, but maybe not till Monday.
The TMDB tests are getting underway and uncovering various glitches. Some problems are associated with using rlstest, which has some entries that we are replicating. We now switch to the production rlscms.
The LCG2 Export Buffer has some problems with a kernel incompatibility with the gridftp requirements. We are making contact with IT/FIO to resolve these problems.
The replica manager with data channel authentication removed is currently in testing. Release expected from LCG later today or Monday. This is the last known issue for LCG-2 compatibility with the SRM-dCache Storage Element (always the possibility for unknown issues).
The CERN/CNAF Oracle RLS synchronization is back on track. We expect to be in a mirrored production service on Tuesday, with the goal of attaching the synchronization to the production RLS later in the week.
We start to think about the analysis phases. "Local" analyses at the T1 centers are being prepared. It is very important to extend the testing to really using the full LCG2 possibilities. Since all files of the DST will be at many sites we would like to demonstrate the ability to run analysis jobs over a large sample of the DST, spreading the jobs via the RB over many sites and worker nodes. This looks to be doable with CNAF, PIC and Legnaro. Running at RAL also (maybe the same is true of GridKa/IN2P3??) will require making the data in their local SRB visible to the LCG2 SE. Running at FNAL in LCG2 will require the RM changes noted above.
A tremendously successful five days, thanks to everyone!
Next Meeting Monday March 8th at 3PM MET
Log Day 4:
Day4 Resource Usage Plot
Julia reports that we ran more than 600 (1000 event) jobs with zero failure. The outputs went to the gdb. There was a problem getting the
summaries into refdb, but it was expected and can be fixed. The refDB can be updated a posteriori. So far we are testing with one stream only.
The catalog fragments and checksums have been coded, but they will be transferred to the GDB and a special data volume when available.
They will be recorded in the RLS and in RefDB. The catalog fragment files will not be distributed to the T1's, they will get them from the RLS.
Timing reports have been circulated, depending on event complexity they are in the range 16-60 seconds (Note that at 20 seconds wall time per
event we would be able to reach 25Hz in the assigned 250 nodes - there is the possibility of using more nodes to actually reach the target
rate)
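The rate estimate above follows from dividing the number of concurrently running jobs by the wall time per event. Reaching 25 Hz at 20 s per event needs 500 concurrent jobs, which matches 250 nodes only if each node runs two jobs at once; that two-slot figure is an assumption made here, not something stated in the log. A short sketch of the arithmetic:

```python
def event_rate_hz(job_slots, seconds_per_event):
    """Aggregate event rate when every slot processes events back to back."""
    return job_slots / seconds_per_event


def slots_needed(target_hz, seconds_per_event):
    """Concurrent jobs required to sustain target_hz."""
    return target_hz * seconds_per_event


SLOTS = 250 * 2  # assumption: 250 nodes with two job slots each

# Span of per-event wall times quoted in the timing reports.
for t in (16, 20, 60):
    print(f"{t:>2} s/event -> {event_rate_hz(SLOTS, t):.1f} Hz")
```

At the slow end of the quoted timing range (60 s per event) the same farm would deliver only about a third of the target rate, which is why the possibility of using more nodes is noted.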
Memory footprints are at about 500MB. Stephan and the PRS coders are chasing down leak sources. The numbers from Julia's mail were
from LSF which sums all threads and leads to a factor of 4 overcounting.
Lassi has most of the logic of the T0 scripts written (spying on the gdb, registering data to the RLS and to the TMDB). The commit to the
RLS has many unpleasant failure modes due to atomicity short-comings. They can't be solved in the DC04 time-scale. Scripts should be
completed today and in testing tomorrow.
Tim has planned an in depth T1 agent test tomorrow. The validation events will be in the gdb tomorrow and the TMDB will be updated from
11am Friday. The T1 centers should have their agents running before that so we can see the transfers to the Export Buffers and to the T1's
in action.
Stephan reported progress towards 7_8_0. The remaining problems requiring access to digis in readDST will be corrected. Goal is to have
a solid 7_8_x release by next Friday.
The Oracle RLS middleware upgrade v2.2.5 containing all recently discussed bug fixes will be deployed on rlstest, rlscert01, rlscert02
Friday - time to be confirmed
Dirk has been contacted to confirm POOL1.6.0 release schedule.
CMS wants the CERN/CNAF Oracle RLS synchronization to be running by the end of next week. Possible validation route is to test with rlstest
and CNAF early next week (ie all CMS RLS entries go to production and to test service)
Current picture for next week's work, to be tuned in Friday meeting:
- Keep jobs running through the weekend, but not 24x7 intervention to keep the gdb, rls, tmdb, eb, t1 system being exercised, debugged. (But
people should plan not to work all weekend, next weekend will be heavy enough...)
- Early next week. T1's demonstrate ability to build local catalogs and get readDST (even toy) jobs running on the data as it comes in. (I
realize not all T1s are ready for this, please inform me if you will be trying to reach this goal) During next week, as the local catalogs
become useable we will need analysis tasks to start to run
- Target is that from next weekend we try to have the system fully running, with ORCA7_8_x, processing of order 1M events (multiple times
if neccesary) to exercise the full chain.
Next Meeting tomorrow at 4PM MET
Log Day 3:
Day3 Resource Usage Plot
Jobs were able to be submitted last night. The memory is currently at 450MB per job. Processing was OK until around 11PM when we
suffered 15 BUS error jobs. It was unclear at the time if this was a thread error, an NFS problem or something else. Subsequent forensics
suggest this was a one-off configuration management problem, and indeed many tens of jobs appear to have succeeded since (Tony
has seen lots of *L1Trigger* files in the gdb pool). There are lots of jobs running now, 480, with 154 pending in the queue.
Werner brought up the concern that we have not tested all our nodes yet and requested a large submission to validate the nodes.
Next step is to ship data to Tier-1 facilities. There was confusion about the push or pull model from the General Distribution
Buffer (GDB) to the Export Buffers (EB). There was also confusion about how many elements currently have prototypes. The transfer
agent group is encouraged to use the mail list for communications (we have generally used directed e-mail). Issues were clarified in the
meeting and Lassi is filling in the missing pieces.
The 'PRS Validation samples' for DST production are now available on the GDB pool and will be published as part of the test of the agents.
Missing pieces of the transfer infrastructure are mainly at the Tier-0 and are being implemented by Lassi.
Current plan for this part of the data flow, from Lassi and Tim.
[A] The event data files are stored in the GDB (castor pool) with rfio; the other files are copied into the RLS publishing agent's
"drop box", more specifically a directory. (Cf. how mail delivery daemons work such that they will never lose any
data, e.g. "man postfix".) This affects the discussion yesterday about where the RLS is updated. The current plan is to have
the GDB agent handle this.
[B] The agent scans the drop box and finds new files in there (or a new subdirectory for the files in question). Hence
this operates 1-to-1 with files appearing from the batch farm. No queries required.
[C] Suitable FCpublish command. Import the whole XML into RLS. If the agent has died, is killed, gets stuck and is
restarted, machine is rebooted, or the moon falls from the sky, you just restart the thing and it immediately continues
processing the "in" drop box where it last left off. If necessary, RLS catalog update will check if the update has
already succeeded fully or in part; in that case the step is bypassed.
[D] RLS publishing agent moves the files (or just the checksum file) to the next drop box on successful RLS catalog
update: TMDB publishing agent. (Again, cf. mail delivery agents.) If successful, files are removed from its own drop box.
[E] Like B, just a different drop box. Will involve creating the initial entry in the TMDB.
[F] Set file retention time to infinite so file won't be migrated to tape. Possibly done directly as part of the
batch job that puts the file on GDB to begin with.
[G] Assign files to T1s and initiate copies to EBs. This functionality is handled by the config agent and the EB copy agents.
[H] Update file status in TMDB, which is handled by the EB agents
[I] This "free agent" notices the file copies are complete (observing state changes in TMDB?) and changes the file
retention time in castor to non-infinite. Currently handled by configuration agent.
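The restart behaviour described in steps [B] to [D] can be sketched as follows. This is only an illustration under stated assumptions, not the agent code written for DC04: the function name process_drop_box, the publish and already_published callbacks, and the file names are all hypothetical stand-ins for the real FCpublish call and RLS lookup. The point it shows is that an entry leaves the "in" drop box only after the catalog update succeeded, so rerunning the agent after a crash simply picks up whatever is still there.

```python
import shutil
import tempfile
from pathlib import Path


def process_drop_box(in_box: Path, out_box: Path, publish, already_published) -> list[str]:
    """Handle every entry currently in in_box; return the names passed on."""
    out_box.mkdir(parents=True, exist_ok=True)
    forwarded = []
    for entry in sorted(in_box.iterdir()):
        # Bypass the catalog update if an earlier, interrupted pass already did it.
        if not already_published(entry.name):
            publish(entry)  # raises on failure, leaving the entry in in_box
        # Hand over to the next agent's drop box only after the update succeeded.
        shutil.move(str(entry), str(out_box / entry.name))
        forwarded.append(entry.name)
    return forwarded


def demo() -> tuple[list[str], list[str], list[str]]:
    """Simulate a restart: run1 was published before a crash, run2 was not."""
    root = Path(tempfile.mkdtemp())
    rls_box, tmdb_box = root / "rls_in", root / "tmdb_in"
    rls_box.mkdir()
    for name in ("run1.xml", "run2.xml"):  # hypothetical XML fragments
        (rls_box / name).write_text("<catalog/>")
    catalog = {"run1.xml"}  # stand-in for the RLS contents
    published_now = []

    def publish(path: Path) -> None:
        published_now.append(path.name)
        catalog.add(path.name)

    forwarded = process_drop_box(rls_box, tmdb_box, publish, lambda n: n in catalog)
    remaining = [p.name for p in rls_box.iterdir()]
    shutil.rmtree(root)
    return forwarded, published_now, remaining
```

In this sketch both files end up in the TMDB agent's drop box, but only run2.xml triggers a catalog update, which is the "step is bypassed" case from [C].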
The meeting finished with a discussion on the recently allocated classical storage element. The machine provided by CERN has the
normal 10 mirrored partitions. Andrea reported that this is a problem for the classical storage element because the classical SE is
functionally a single entity. The amount used is reported to the information provider on a partition basis. There is not an obvious
solution to this. Ian commented that the SRM-dCache SE has a compatible information provider. The disk usage is reported as the
sum of all the usage for all the systems over all the partitions which make up the storage element. Hopefully issues with access from the
replica manager will be solved before the end of the week allowing us to access the SRM-dCache EB from the RM.
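The difference between the two reporting styles can be illustrated with a small sketch. The partition names and sizes below are invented for the example and are not the real server layout; the code only shows that a per-partition report yields ten separate figures while the summed report gives one figure for the whole storage element.

```python
from typing import NamedTuple


class Partition(NamedTuple):
    name: str
    used_gb: float
    total_gb: float


def aggregate(partitions: list[Partition]) -> tuple[float, float]:
    """Report usage as a single (used, total) pair summed over all partitions."""
    used = sum(p.used_gb for p in partitions)
    total = sum(p.total_gb for p in partitions)
    return used, total


# Ten partitions with made-up, illustrative sizes.
partitions = [Partition(f"/data{i:02d}", used_gb=10.0 * i, total_gb=100.0) for i in range(10)]

per_partition_report = [(p.name, p.used_gb) for p in partitions]  # ten entries
single_entity_report = aggregate(partitions)                      # one entry
```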
Log Day 2:
Monday night Julia got refDB and MCRunJob modified for reconstruction jobs and managed to run ~100 jobs (actually with the same input file by mistake). Some problems showed up that she and Stephan are chasing down. Expect to try again with a new set of jobs shortly.
Stephan reported the problems with writeDST jobs and the L1 trigger are solved, no longer having any dependence on hits, assoc or digis. Still to check if problems solved for readDST. Expect to release this in 7_7_1_pre1 shortly.
Next goal of production (when they can get jobs completed!) is to get the files into the GDB (the buffer, not the board). The files and metadata must be registered in the RLS and the files to export in the Transport DataBase. There was considerable discussion as to when this happens, in particular at the end of the batch job or in a post reco job step. We agreed that the initial trials will be to factorize the reco step from the data registration step. Tony/Julia get the output files and xml fragments into the GDB. Lassi writes agents to spy on the arrival of files and register them to RLS/TDB. We may decide to register files as they appear, or as all the files for the run appear. We can investigate later the effect of running this in the batch jobs. (David certainly thinks it is a mistake to couple the batch jobs to the DB update and as he is writing the minutes..) Goal is to have this running in at least a primitive form tomorrow (Wednesday) so T1's can start to pull files out of the GDB through the export buffers...
The Oracle RLS string buffer limitations will be fixed in a Thursday morning reboot of the service.
The SRB export buffer is working with apparently good performance.
The SRM/dCache buffer is working at 20-30MB/s. The "Data Channel Authentication" option is expected to be solved in the new RM tools thus possibly allowing this SRM and the LCG SRM to co-exist on the same buffer.
Andrea has been ill, but will check on the status of the LCG Classical SE.
Iosif reminded us that he can make more finely grained information available if for example people identify particular job types that should be monitored. He comments that we should ask for the SNMP version on the CERN dataservers to be upgraded (Werner please..). He will add disk IO monitoring.
Log Day 1
Monday March-1, 15:00 Status meeting.
L1 trigger persistency problem can be fixed. Probable solution for dependence on simhits has been found and needs to be tested. Test without SimHits/MCInfo when reading DST succeeded, now trying writeDST. Tracker working on fix to dependence on Digis when reading DST.
We should expect a 7_7_1 release in the next 24-36 hours.
The string length problem for metadata (basically the list of reconstructors, of which there will be about 50-100 in DC04) has been fixed in POOL. The fix is being tested to be part of release POOL_1.6.0. We ask POOL team to release this as quickly as they safely can (sometime this week is fine). We would have to build COBRA and ORCA against this release and there have been some interface changes. The limit would then go from 256 to 65k chars.
The Oracle RLS needs to be fixed too and this can be done in the next day or so. There was a suggestion from Alessandra that this could still limit us at 4k chars? We need to check this with RLS team.
(Stephan confirms that 4k would not be a disaster for DC04)
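A rough way to see why 256 chars is too small while 4k is tolerable is to estimate the length of a reconstructor list. Everything specific here is an assumption made for illustration: the 30-character average name length, the comma separator, the placeholder names, and reading "4k" and "65k" as 4096 and 65535. The real metadata format may differ.

```python
# Assumed limits in characters; 4096 and 65535 are one reading of "4k" and "65k".
LIMITS = {
    "POOL before fix": 256,
    "Oracle RLS (suggested)": 4096,
    "POOL_1.6.0": 65535,
}


def metadata_string(n_reconstructors: int, name_length: int = 30) -> str:
    """Build a comma-separated list of placeholder names of a fixed length."""
    names = [f"Reco{i:03d}".ljust(name_length, "x") for i in range(n_reconstructors)]
    return ",".join(names)


def fits(n_reconstructors: int, name_length: int = 30) -> dict[str, bool]:
    """Say, for each limit, whether the list of this size stays within it."""
    length = len(metadata_string(n_reconstructors, name_length))
    return {label: length <= limit for label, limit in LIMITS.items()}
```

Under these assumptions a list of 100 reconstructors is about 3100 characters long, which overflows the old 256 limit by a wide margin but still fits within 4k.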
The POOL MYSQL catalog needs the fix Zhen developed for it deployed for DC04.
MYSQL as an RLS backend is not on the critical path. We hope it can be done soon nevertheless, but as part of a normal release sequence.
Production team is close to submitting jobs, in the next few hours. Their goal is to get some 7_7_0 jobs through the system, see the failure modes, get some data (clearly not useful data but files of appropriate sizes and types) into files and registered to the Transfer DB and let the regional centers exercise the transfer etc.
