Discussion:
[dev] ASCII Delimited Text
Adrian Grigore
2018-05-16 14:05:40 UTC
What do you guys think of this:

https://ronaldduncan.wordpress.com/2009/10/31/text-file-formats-ascii-delimited-text-not-csv-or-tab-delimited-text/
--
Thanks,
Adi
Martin Tournoij
2018-05-16 14:46:52 UTC
Post by Adrian Grigore
https://ronaldduncan.wordpress.com/2009/10/31/text-file-formats-ascii-delimited-text-not-csv-or-tab-delimited-text/
I think it's a reasonable alternative to CSV or TSV. I actually used it
for the file format for a small Vim plugin I wrote a few years ago:
https://github.com/Carpetsmoker/complete_email.vim#file-format

It's not perfect though. It's a bit difficult to type manually: <C-v>030
is quite a lot of keystrokes.
It also shows up as ^^ in Vim with the default settings, which looks
kind of weird. Some other editors may not display it at all.

Especially if you want non-tech people to deal with it, using RS or US is
perhaps not a good idea. They'll open it in Notepad or Word and who
knows in what kind of interesting ways it'll get mangled and broken. In
a perfect world it would deal well with it, but Notepad still can't
handle Unix newlines...
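
For reference, a minimal sketch of reading this format in C. It assumes fields are separated by US (0x1F), records by RS (0x1E), and that no field contains a NUL byte. The name adt_next is made up for illustration and comes from neither the linked article nor the Vim plugin.

```c
#include <stddef.h>

enum { RS = 0x1E, US = 0x1F };

/*
 * strsep-like tokenizer for ASCII delimited text (hypothetical helper).
 * Returns the next field from *cursor, overwriting the separator that
 * follows it with NUL. *sep is set to that separator: US, RS, or 0 when
 * the end of the string was reached. Returns NULL once input is used up.
 */
char *
adt_next(char **cursor, int *sep)
{
	char *field = *cursor, *p;

	if (field == NULL)
		return NULL;
	for (p = field; *p != '\0' && *p != US && *p != RS; p++)
		;
	*sep = (unsigned char)*p;
	if (*p == '\0') {
		*cursor = NULL;		/* input exhausted */
	} else {
		*p = '\0';		/* terminate the field in place */
		*cursor = p + 1;
	}
	return field;
}
```

A caller starts a new record whenever *sep comes back as RS. If the input ends with RS, the last call yields an empty field with *sep set to 0, which the caller can ignore.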
Martin Tournoij
2018-05-16 15:22:37 UTC
On Wed, May 16, 2018, at 15:05, Adrian Grigore wrote: In a perfect
world it would deal well with it, but Notepad still can't handle Unix
newlines...
https://hackaday.com/2018/05/08/windows-notepad-now-supports-unix-line-endings/
Oh wow, after 20 years they finally had an intern add this in an afternoon!

But how will it handle other control characters such as RS? I'd be surprised
if that would work well.
Silvan Jegen
2018-05-16 17:39:11 UTC
Post by Martin Tournoij
On Wed, May 16, 2018, at 15:05, Adrian Grigore wrote: In a perfect
world it would deal well with it, but Notepad still can't handle Unix
newlines...
https://hackaday.com/2018/05/08/windows-notepad-now-supports-unix-line-endings/
Oh wow, after 20 years they finally had an intern add this in an afternoon!
Bad news:

https://en.wikipedia.org/wiki/Microsoft_Notepad

It has been 30+ years already.


Cheers,

Silvan
Adrian Grigore
2018-05-17 14:50:07 UTC
How would you have other tools like cat(1) or ls(1) handle them?
Post by Adrian Grigore
https://ronaldduncan.wordpress.com/2009/10/31/text-file-formats-ascii-delimited-text-not-csv-or-tab-delimited-text/
Seems reasonable to me. For the purpose of transferring data between two different spreadsheet programs that cannot read each other's native formats, seems much better than using tabs or commas.
If I were editing this text format directly, I'd want my editor/pager to have some tools to make it easier. It could display the Unit Separator as a special-color pipe, the Record Separator as a special-color newline, and the Group Separator as a special-color double newline (i.e. a blank line). It would also need an easy way to type these characters. For example, the Unit Separator can be entered in Emacs as C-x 8 RET 1 f RET, so a better shortcut would be useful, e.g. C-| (control pipe).
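
The display idea can be tried outside an editor with a small filter. This is only a sketch: adt_render is a hypothetical name, color is left out, and it assumes a NUL-terminated input with GS = 0x1D, RS = 0x1E and US = 0x1F.

```c
#include <stddef.h>

enum { GS = 0x1D, RS = 0x1E, US = 0x1F };

/*
 * Copy the NUL-terminated string in to out (capacity outsz, including the
 * terminating NUL), replacing US with '|', RS with a newline and GS with a
 * blank line. Returns the number of bytes written, not counting the NUL.
 * Output that does not fit is silently truncated.
 */
size_t
adt_render(const char *in, char *out, size_t outsz)
{
	size_t n = 0;

	if (outsz == 0)
		return 0;
	for (; *in != '\0'; in++) {
		const char *rep;
		char one[2] = { *in, '\0' };

		switch (*in) {
		case US: rep = "|"; break;
		case RS: rep = "\n"; break;
		case GS: rep = "\n\n"; break;
		default: rep = one; break;	/* ordinary byte: copy as is */
		}
		for (; *rep != '\0'; rep++) {
			if (n + 1 >= outsz)
				goto done;	/* keep room for the NUL */
			out[n++] = *rep;
		}
	}
done:
	out[n] = '\0';
	return n;
}
```

Going the other way (turning the visible form back into separators) is deliberately not attempted, since a literal pipe or newline in the data would be ambiguous.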
--
Thanks,
Adi
Raphaël Proust
2018-05-18 00:51:18 UTC
Post by Adrian Grigore
How would you have other tools like cat(1) or ls(1) handle them?
Through the $FS and $RS environment variables.

So you can do `FS=\xHH ls` (where HH is the hex code for the ASCII field
separator). The ls in usul handles that.
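
How usul actually reads these variables is not shown in this thread. The following is only a sketch of the idea, with hypothetical helper names sep_parse and sep_from_env, and it assumes the variable holds the separator as a single literal byte.

```c
#include <stdlib.h>

/*
 * Return the separator byte described by value, or fallback when value is
 * NULL or empty. Only the first byte of value is used.
 */
int
sep_parse(const char *value, int fallback)
{
	if (value == NULL || value[0] == '\0')
		return fallback;
	return (unsigned char)value[0];
}

/* Look up the environment variable name (e.g. "FS" or "RS"). */
int
sep_from_env(const char *name, int fallback)
{
	return sep_parse(getenv(name), fallback);
}
```

A tool would then call something like sep_from_env("FS", '\t') and sep_from_env("RS", '\n') at startup. The tab and newline defaults are an assumption of this sketch.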

-- Raphaël
Greg Reagle
2018-05-17 21:50:43 UTC
Post by Adrian Grigore
How would you have other tools like cat(1) or ls(1) handle them?
I don't know. The way it currently handles them perhaps.
Connor Lane Smith
2018-05-21 16:38:24 UTC
Hi,
This is a surprise. Where did you get usul from? I'm not sure even I
have a copy any more! The only reason I can think of, though, is that
you need to specify the -L libdir.

I have been wondering lately about resurrecting usul, since 'tabular
munging' with Unix utilities without the ability to set $FS and $RS
can be really unpleasant, and I've had to do quite a bit of it
recently.

I did also write a program to complement usul which performs elastic
tabbing on its input, the idea being that you end up with a nice
tabular view in your terminal. I think ideally you might want 'real'
terminal support for it though, as when printing to the tty you either
need to buffer a lot of data (i.e. until a line with no FS), or you
need to do crazy tty-rewriting ANSI escapes, which it did support but
is a massive hack.
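
To illustrate why buffering comes up at all: a column's width is only known once every row that contributes to it has been seen. The sketch below is not taken from usul or from the elastic tabbing program. col_widths is a hypothetical name, and it sizes columns over the whole input, which is simpler than real elastic tabbing. It assumes US-separated fields, RS-terminated records, and counts width in bytes rather than display cells.

```c
#include <stddef.h>

enum { RS = 0x1E, US = 0x1F };

/*
 * Scan the NUL-terminated string in and record, for each column index
 * below max, the widest field (in bytes) seen in that column across all
 * records. Returns the number of columns found (capped at max).
 */
size_t
col_widths(const char *in, size_t *widths, size_t max)
{
	size_t col = 0, len = 0, ncols = 0;

	for (;; in++) {
		if (*in != US && *in != RS && *in != '\0') {
			len++;
			continue;
		}
		/* A field just ended, unless this is the empty tail after a final RS. */
		int trailing = (*in == '\0' && len == 0 && col == 0);
		if (!trailing && col < max) {
			if (col >= ncols) {	/* first time this column is seen */
				widths[col] = 0;
				ncols = col + 1;
			}
			if (len > widths[col])
				widths[col] = len;
		}
		if (*in == '\0')
			break;
		col = (*in == US) ? col + 1 : 0;	/* RS starts a new record */
		len = 0;
	}
	return ncols;
}
```

Only after this pass can the first row be padded and printed, which is the buffering cost described above.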

Thanks,
Connor
Adrian Grigore
2018-05-21 16:51:19 UTC
Post by Connor Lane Smith
This is a surprise. Where did you get usul from? I'm not sure even I
have a copy any more! The only reason I can think of, though, is that
you need to specify the -L libdir.

Attachment above. :)

cc -lutf -L. -o cat cat.o util.o gives:

cat.o: In function `main':
cat.c:(.text+0x179): undefined reference to `chartorune'
cat.c:(.text+0x1dd): undefined reference to `runetochar'
cc: error: linker command failed with exit code 1 (use -v to see invocation)

I have libutf.a in the current directory.

I can't even compile this:
#include <utf.h>

int
main(int argc, char *argv[])
{
	Rune r;

	/* Any call into libutf will do; this one fails to link. */
	chartorune(&r, "");
	return 0;
}

with
cc -L. -lutf -o utf utf.c
getting
/tmp/utf-1e880d.o: In function `main':
utf.c:(.text+0x33): undefined reference to `chartorune'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
Post by Connor Lane Smith
I did also write a program to complement usul which performs elastic
tabbing on its input, the idea being that you end up with a nice
tabular view in your terminal. I think ideally you might want 'real'
terminal support for it though, as when printing to the tty you either
need to buffer a lot of data (i.e. until a line with no FS), or you
need to do crazy tty-rewriting ANSI escapes, which it did support but
is a massive hack.

This sounds interesting.
--
Thanks,
Adi
Raphaël Proust
2018-05-22 00:24:03 UTC
Hello,
Post by Connor Lane Smith
This is a surprise. Where did you get usul from?
I sent the copy. I use usul regularly so I still have the whole repo
locally.
Post by Connor Lane Smith
I have been wondering lately about resurrecting usul, since 'tabular
munging' with Unix utilities without the ability to set $FS and $RS
can be really unpleasant, and I've had to do quite a bit of it
recently.
I did also write a program to complement usul which performs elastic
tabbing on its input, the idea being that you end up with a nice
tabular view in your terminal. I think ideally you might want 'real'
terminal support for it though, as when printing to the tty you either
need to buffer a lot of data (i.e. until a line with no FS), or you
need to do crazy tty-rewriting ANSI escapes, which it did support but
is a massive hack.
In what way is the elastic tabbing different from what Plan 9's mc(1) does?


Ciao,
-- Raphaël
Adrian Grigore
2018-05-22 13:21:13 UTC
Maybe a nice thing to have would be to get the terminal emulator to
treat the field and record separators in a special way. So the programs
all output fs and rs, and the terminal emulator uses these characters to
lay out the data in a tabular way.

There's no terminal that does this, right?
--
Thanks,
Adi
Peter Nagy
2018-05-22 13:24:11 UTC
Smells like an st patch
--
Peter Nagy

- To reach a goal one has to enjoy the journey
Raphaël Proust
2018-05-23 00:55:43 UTC
Hi,
Post by Adrian Grigore
Maybe a nice thing to have would be to get the terminal emulator to
treat the field and record separator in a special way. So the programs
all output fs and rs, and the terminal emulator uses these characters to
lay out the data in a tabular way.
There's no terminal that does this, right?
None that I know of.


There is https://github.com/unconed/TermKit which has “Rich output for
common tasks and formats, using MIME types + sniffing”. In practice,
this means that if you `cat foo.png` it'll show up as an image and if
you `cat foo.html` it'll show the output with syntax colouring.

But it's not a text-based terminal emulator: it's written in JavaScript
and runs on top of a chromeless browser.


I think it should be possible to have some level of “rich output”
without ruining all the benefits of a simple and consistent text-based
system. Tabulating output based on fs/rs would be one thing.

OTOH, you might as well just pipe the output through a layout program
that transforms all the fs/rs into characters appropriate for visual
inspection.


Bye,
-- Raphaël

Connor Lane Smith
2018-05-22 13:44:00 UTC
Post by Raphaël Proust
I sent the copy. I use usul regularly so I still have the whole repo
locally.
Could you send me a copy as well? I'd also be interested to know what
sort of things you tend to use it for, in case it could be made
better.
Post by Raphaël Proust
In what way is the elastic tabbing different from what Plan 9's mc(1) does?
foo
bar
baz
foo bar baz
foo bar baz
dis establishment arianism
foo bar baz
dis establishment arianism
I've uploaded the source [1], but I've not touched it in 3 years so I'm
not really sure what state it's in. There are a few things I'd planned
to add, like options for not elasticising leading tabs, and
right-aligning numeric fields, which I might get around to if I feel
like it.

[1]: https://github.com/cls/elastic

Thanks,
Connor
Silvan Jegen
2018-05-22 16:03:39 UTC
Post by Raphaël Proust
I sent the copy. I use usul regularly so I still have the whole repo
locally.
[...]
Post by Raphaël Proust
foo bar baz
dis establishment arianism
foo bar baz
dis establishment arianism
This sounds like 'column'[0].


Cheers,

Silvan

[0] https://www.freebsd.org/cgi/man.cgi?query=column&sektion=1
Connor Lane Smith
2018-05-22 16:33:12 UTC
Post by Silvan Jegen
This sounds like 'column'[0].
It's similar to column -t, except that it handles varying field counts
in a similar way to gofmt, and it can use ANSI escapes to rewrite the
output so it can stream without buffering all (or any) input or output
up front.

Thanks,
Connor
Adrian Grigore
2018-05-21 18:57:34 UTC
Post by Markus
cc -lutf -o cat cat.o util.o
cat.c:(.text+0x179): undefined reference to `chartorune'
cat.c:(.text+0x1dd): undefined reference to `runetochar'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
gmake: *** [Makefile:15: cat] Error 1
Try putting the library at the end. Some linkers display rather...
classic behavior when linking statically (i.e. only linking in the files
that are needed, but if you name a library as first thing, then no file
is needed at that point).
Ciao,
Markus
Works!
--
Thanks,
Adi